Re: [Mesa-dev] [PATCH] clover: fix clBuildProgram Piglit regression

2014-11-03 Thread Francisco Jerez
Tom Stellard  writes:

> On Sun, Nov 02, 2014 at 08:03:31PM +0200, Francisco Jerez wrote:
>> EdB  writes:
>> 
>> > should trigger CL_INVALID_VALUE
>> > if device_list is NULL and num_devices is greater than zero.
>> >
>> > introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
>> 
>> Tom, can you just drop the vector of devices parameter and validate
>> the d_devs/num_devs arguments from validate_build_program_common() by
>> calling objs, as I suggested when I gave my R-b for
>> your commit.
>> 
>
> The reason I kept the vector of devices is because if the
> device list is NULL, then the device list from the context
> needs to be used.

No, that's fine, as I said earlier, when you end up using the default
device list, that list doesn't need to be error-checked, so there's no
need to pass it to validate_build_program_common(), you can just pass
the NULL/0 device list arguments.
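
The rule described above can be sketched in isolation. This is only a minimal sketch, not clover's code: `Device`, `Context`, `BuildError` and `resolve_build_devices` are hypothetical stand-ins, and the enum values stand in for CL_INVALID_VALUE and CL_INVALID_DEVICE. It shows that only a caller-supplied list is checked, while the context's own list is used unchecked in the NULL/0 case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real OpenCL/clover types.
struct Device { int id; };

enum class BuildError { None, InvalidValue, InvalidDevice };

struct Context {
   std::vector<const Device *> devices;
};

// Validate the caller-supplied device list and return the devices to
// build for.  On failure, err is set and an empty vector is returned.
std::vector<const Device *>
resolve_build_devices(const Context &ctx, const Device *const *d_devs,
                      std::size_t num_devs, BuildError &err) {
   assert(!ctx.devices.empty());   // a context always has some device
   err = BuildError::None;

   // NULL list with a non-zero count, or a non-NULL list with count 0.
   if (bool(d_devs) != bool(num_devs)) {
      err = BuildError::InvalidValue;
      return {};
   }

   // NULL/0: use the context's own devices; nothing to validate.
   if (!d_devs)
      return ctx.devices;

   // Otherwise every requested device must belong to the context.
   std::vector<const Device *> devs(d_devs, d_devs + num_devs);
   for (const Device *dev : devs) {
      if (std::find(ctx.devices.begin(), ctx.devices.end(), dev) ==
          ctx.devices.end()) {
         err = BuildError::InvalidDevice;
         return {};
      }
   }
   return devs;
}
```

With a context holding two devices, passing NULL/0 yields both of them, NULL with a count of 1 yields the invalid-value error, and a device from outside the context yields the invalid-device error.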

> I didn't want to duplicate this logic in
> validate_build_program_common(), so I added the allow_empty_tag to the
> API functions instead.
>
> I think EdB's fix is a better solution, what do you think?
>
> -Tom
>
>> Thanks.
>> 
>> > ---
>> >  src/gallium/state_trackers/clover/api/program.cpp | 20 
>> > +++-
>> >  1 file changed, 11 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/src/gallium/state_trackers/clover/api/program.cpp 
>> > b/src/gallium/state_trackers/clover/api/program.cpp
>> > index 64c4a43..dc89730 100644
>> > --- a/src/gallium/state_trackers/clover/api/program.cpp
>> > +++ b/src/gallium/state_trackers/clover/api/program.cpp
>> > @@ -27,7 +27,7 @@ using namespace clover;
>> >  
>> >  namespace {
>> > void validate_build_program_common(const program &prog, cl_uint 
>> > num_devs,
>> > -  const ref_vector &devs,
>> > +  ref_vector &devs,
>> >void (*pfn_notify)(cl_program, void 
>> > *),
>> >void *user_data) {
>> >  
>> > @@ -37,10 +37,14 @@ namespace {
>> >if (prog.kernel_ref_count())
>> >   throw error(CL_INVALID_OPERATION);
>> >  
>> > -  if (any_of([&](const device &dev) {
>> > -   return !count(dev, prog.context().devices());
>> > -}, devs))
>> > - throw error(CL_INVALID_DEVICE);
>> > +  if (!num_devs) {
>> > + devs = prog.context().devices();
>> > +  } else {
>> > + if (any_of([&](const device &dev) {
>> > +  return !count(dev, prog.context().devices());
>> > +   }, devs))
>> > +throw error(CL_INVALID_DEVICE);
>> > +  }
>> > }
>> >  }
>> >  
>> > @@ -173,8 +177,7 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
>> > void (*pfn_notify)(cl_program, void *),
>> > void *user_data) try {
>> > auto &prog = obj(d_prog);
>> > -   auto devs = (d_devs ? objs(d_devs, num_devs) :
>> > -ref_vector(prog.context().devices()));
>> > +   auto devs = objs(d_devs, num_devs);
>> > auto opts = (p_opts ? p_opts : "");
>> >  
>> > validate_build_program_common(prog, num_devs, devs, pfn_notify, 
>> > user_data);
>> > @@ -195,8 +198,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
>> >   void (*pfn_notify)(cl_program, void *),
>> >   void *user_data) try {
>> > auto &prog = obj(d_prog);
>> > -   auto devs = (d_devs ? objs(d_devs, num_devs) :
>> > -ref_vector(prog.context().devices()));
>> > +   auto devs = objs(d_devs, num_devs);
>> > auto opts = (p_opts ? p_opts : "");
>> > header_map headers;
>> >  
>> > -- 
>> > 1.9.3
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev




Re: [Mesa-dev] [PATCH][RFC] mesa/main: Clamp rgba with streamed sse

2014-11-03 Thread Juha-Pekka Heikkila
On 31.10.2014 20:30, Roland Scheidegger wrote:
> On 31.10.2014 at 18:17, Matt Turner wrote:
>> On Fri, Oct 31, 2014 at 3:13 AM, Juha-Pekka Heikkila
>>  wrote:
>>> Signed-off-by: Juha-Pekka Heikkila 
>>> ---
>>>  src/mesa/main/colormac.h  | 20 +++
>>>  src/mesa/main/pixeltransfer.c | 59 
>>> ---
>>>  2 files changed, 64 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/src/mesa/main/colormac.h b/src/mesa/main/colormac.h
>>> index c8adca6..da5e094 100644
>>> --- a/src/mesa/main/colormac.h
>>> +++ b/src/mesa/main/colormac.h
>>> @@ -51,6 +51,26 @@ _mesa_unclamped_float_rgba_to_ubyte(GLubyte dst[4], 
>>> const GLfloat src[4])
>>>
>>>
>>>  /**
>>> + * Clamp four float values to [min,max]
>>> + */
>>> +#if defined(__SSE2__) && defined(__GNUC__)
>>> +static inline void
>>> +_mesa_clamp_float_rgba(GLfloat src[4], GLfloat result[4], const float min,
>>> +   const float max)
>>> +{
>>> +__m128  operand, minval, maxval;
>>> +
>>> +operand = _mm_loadu_ps(src);
>>
>> Surely 128-bit pixels will be 128-bit aligned? I think we can use an
>> aligned load here.
> I can't see why that would be the case (sure in some cases it might be
> but I don't see anything which would actually guarantee that). temp
> images themselves don't seem to have any forced alignment (that could be
> fixed though I didn't check if there are other paths into it which can't
> be easily fixed).

On a 64-bit SNB build with my standard build options, I saw the pixels
aligned to only 64 bits at worst, which is why I use unaligned load/store here.
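
To illustrate the alignment point, here is a minimal sketch of a four-float clamp that makes no alignment assumption. `clamp4`, `is_aligned16` and `SKETCH_HAVE_SSE2` are hypothetical names rather than anything from the patch, and the scalar branch is only there so the sketch still builds where SSE2 is not enabled at compile time.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Clamp four floats to [lo, hi].  src and dst may have any alignment.
inline void clamp4(const float *src, float *dst, float lo, float hi) {
   assert(lo <= hi);
#if SKETCH_HAVE_SSE2
   __m128 v = _mm_loadu_ps(src);          // unaligned load
   v = _mm_max_ps(v, _mm_set1_ps(lo));
   v = _mm_min_ps(v, _mm_set1_ps(hi));
   _mm_storeu_ps(dst, v);                 // unaligned store
#else
   for (int i = 0; i < 4; ++i) {
      float x = src[i];
      dst[i] = x < lo ? lo : (x > hi ? hi : x);
   }
#endif
}

// True if p is 16-byte aligned, which is what the aligned variants
// _mm_load_ps/_mm_store_ps would require.
inline bool is_aligned16(const void *p) {
   return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}
```

A pixel that starts 8 bytes into a 16-byte line fails `is_aligned16`, so switching to the aligned load/store would only be safe once the allocation of the temporary images guarantees 16-byte alignment.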

> 
>>
>>> +minval = _mm_set1_ps(min);
>>> +maxval = _mm_set1_ps(max);
>>> +operand = _mm_max_ps(operand, minval);
>>> +operand = _mm_min_ps(operand, maxval);
>>> +_mm_storeu_ps(result, operand);
>>
>> And an aligned store here.
>>
>>> +}
>>> +#endif
>>> +
>>> +
>>> +/**
>>>   * \name Generic color packing macros.  All inputs should be GLubytes.
>>>   *
>>>   * \todo We may move these into texstore.h at some point.
>>> diff --git a/src/mesa/main/pixeltransfer.c b/src/mesa/main/pixeltransfer.c
>>> index 8bbeeb8..e16eb59 100644
>>> --- a/src/mesa/main/pixeltransfer.c
>>> +++ b/src/mesa/main/pixeltransfer.c
>>> @@ -35,7 +35,7 @@
>>>  #include "pixeltransfer.h"
>>>  #include "imports.h"
>>>  #include "mtypes.h"
>>> -
>>> +#include "x86/common_x86_asm.h"
>>>
>>>  /*
>>>   * Apply scale and bias factors to an array of RGBA pixels.
>>> @@ -89,16 +89,34 @@ _mesa_map_rgba( const struct gl_context *ctx, GLuint n, 
>>> GLfloat rgba[][4] )
>>> const GLfloat *bMap = ctx->PixelMaps.BtoB.Map;
>>> const GLfloat *aMap = ctx->PixelMaps.AtoA.Map;
>>> GLuint i;
>>> -   for (i=0;i<n;i++) {
>>> -  GLfloat r = CLAMP(rgba[i][RCOMP], 0.0F, 1.0F);
>>> -  GLfloat g = CLAMP(rgba[i][GCOMP], 0.0F, 1.0F);
>>> -  GLfloat b = CLAMP(rgba[i][BCOMP], 0.0F, 1.0F);
>>> -  GLfloat a = CLAMP(rgba[i][ACOMP], 0.0F, 1.0F);
>>> -  rgba[i][RCOMP] = rMap[F_TO_I(r * rscale)];
>>> -  rgba[i][GCOMP] = gMap[F_TO_I(g * gscale)];
>>> -  rgba[i][BCOMP] = bMap[F_TO_I(b * bscale)];
>>> -  rgba[i][ACOMP] = aMap[F_TO_I(a * ascale)];
>>> +
>>> +#if defined(__SSE2__) && defined(__GNUC__)
>>> +   if (cpu_has_xmm2) {
>>
>> #ifdef __SSE2__ means the compiler is free to use SSE2 instructions
>> whenever it pleases. That's not what you want here, if you're also
>> doing runtime checking (cpu_has_xmm2).
>>
>> The typical way to do this is to put the function containing SSE
>> instructions in a separate file that is compiled with -msse2. See
>> streaming-load-memcpy.c for example. gcc has a way to mark specific
>> functions, but since we have to compile with MSVC...
>>
>> I think you just want to copy-and-paste the SSE4.1 testing code in
>> configure.ac for SSE2 and then wrap these uses in #ifdef USE_SSE2.
>>
>>> +  for (i=0;i<n;i++) {
>>> + GLfloat rgba_temp[4];
>>> + _mesa_clamp_float_rgba(rgba[i], rgba_temp, 0.0F, 1.0F);
>>> + rgba[i][RCOMP] = rMap[F_TO_I(rgba_temp[RCOMP] * rscale)];
>>> + rgba[i][GCOMP] = gMap[F_TO_I(rgba_temp[GCOMP] * gscale)];
>>> + rgba[i][BCOMP] = bMap[F_TO_I(rgba_temp[BCOMP] * bscale)];
>>> + rgba[i][ACOMP] = aMap[F_TO_I(rgba_temp[ACOMP] * ascale)];
>>
>> Oh, but we shouldn't be bothering to store the floats back to memory
>> anyway. We should just do this part with SSE as well.
> 
> Indeed the mul scale and F_TO_I look like obvious candidates for
> vectorization (though the table part unfortunately not).

I'll rework this patch, it'll be a bit different looking with these
suggested changes.

/Juha-Pekka



Re: [Mesa-dev] [PATCH][RFC] mesa/main: Clamp rgba with streamed sse

2014-11-03 Thread Timothy Arceri
On Fri, 2014-10-31 at 17:24 +, Jose Fonseca wrote:
> On 31/10/14 17:01, Matt Turner wrote:
> > On Fri, Oct 31, 2014 at 4:12 AM, Jose Fonseca  wrote:
> >> On 31/10/14 10:13, Juha-Pekka Heikkila wrote:
> >>>
> >>>defined(__SSE2__) && defined(__GNUC__)
> >>
> >>
> >> Instead of duplicating this expression everywhere, let's create a
> >> "HAVE_SSE2_INTRIN" define.  Not only is this expression complex, it will
> >> become even more so when we update it for MSVC.
> >
> > Isn't testing __SSE2__ sufficient? Does MSVC not do this?
> >
> > clang/icc/gcc all implement this and all of the _mm_* intrinsics.
> >
> 
> No, __SSE2__ is a GCC-only macro.  It's not defined or needed by MSVC
> compilers.  And I strongly suspect that the Intel compiler probably only
> defines it for GCC compatibility.
> 
> 
> This is because GCC is quite lame IMO: it can't distinguish between
> "enabling SSE intrinsics" (i.e., allowing emmintrin.h to be included and
> the Intel _mm_* intrinsics to be used) and emitting SSE2 opcodes of its own
> accord.  That is, when you pass -msse2 to GCC, you're also giving carte
> blanche for GCC to emit SSE2 opcodes for any C code!  Which makes it _very_
> hard to have special code paths for SSE1/2/3/4/etc and no SSE, since you
> basically need to compile each path in a different C module, passing
> different -msse* flags to each.

So does anyone have a suggestion how this can be better organised? As in
should there be an SSE folder somewhere?

Currently streaming-load-memcpy.c is in mesa/main even though it's only
used by the Intel driver. My patch also adds another file there, and I've
noticed this [1], which should be made to use a runtime switch too.

Dumping everything in Mesa main would obviously get messy fast.

[1]
http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/drivers/dri/i965/intel_tex_subimage.c#n199


> 
> Whereas on MSVC, you can #include emmintrin.h any time, anywhere, and
> only the code that uses the intrinsics will generate those opcodes.  So
> you can have an awesomeFunctionC(), awesomeFunctionSSE2(),
> awesomeFunctionAVX() all next to each other, and a switch table to jump
> into them.
> 
> 
> In other words, on MSVC, instead of
> 
>#if defined(__SSE2__) && defined(__GNUC__)
> 
> all you need is
> 
>#if 1
> 
> or
> 
>#if defined(_M_IX86) ||  defined(_M_X64)
> 
> if you want the code not to cause problems when targeting non-x86
> architectures.
> 
> 
> 
> Of course there's some merit in GCC emitting SSE instructions for plain C
> code, but let's face it: virtually all the code that can benefit from
> SIMD is too complex to be auto-vectorized by compilers, and needs humans
> writing code with SSE intrinsics.  So GCC is effectively tailored to make
> the rare thing easy, at the expense of making the common thing hard...
> 
> 
> I believe recent GCC versions have better support for having specialized
> SSE code side-by-side. But from what I remember of it, it is all pretty
> non-standard and GCC specific, so still pretty useless for portable code.
> 
> 
> Jose


Re: [Mesa-dev] [PATCH] draw: allow LLVM use on non-SSE2 X86 cpus

2014-11-03 Thread Roland Scheidegger
On 01.11.2014 at 22:19, David Heidelberg wrote:
> 
> This patch removes the workaround for a bug in LLVM < 3.2.
> 
> Original bug has been closed as fixed in 2011.
> At this moment gallium requires LLVM 3.3 (2013).
> 
> LLVM has been tested without SSE2 support in commit
> ca70de9bd20bc4a11b2d2d368e0cc1f49527a947 and removed after requiring
> LLVM 3.3 in commit 013ff2fae13da41c2f5619c4698b0a7b5aa6a06d
> 
> Original LLVM bug:
> http://llvm.org/bugs/show_bug.cgi?id=6960
> 
> Signed-off-by: David Heidelberg 
> ---
>  src/gallium/auxiliary/draw/draw_context.c | 15 +--
>  1 file changed, 1 insertion(+), 14 deletions(-)
> 

Reviewed-by: Roland Scheidegger 

Though at this point I'll have to wonder why you'd actually want to use
it with a non-sse2 capable cpu...



Re: [Mesa-dev] [PATCH] Set llvmpipe and softpipe note only for MSAA.

2014-11-03 Thread Roland Scheidegger
On 02.11.2014 at 18:35, Romain Failliot wrote:
> Hi!
> 
> Sorry if I'm doing this wrong, first time here. I've tried git
> send-email, but I don't have an SMTP server so it wasn't working. Here is
> the simple commit (and the patch attached):
> 
> Set llvmpipe and softpipe note only for MSAA.
>
> Right now, in mesamatrix.net, the footnote is set so that it seems to
> apply to all the features, while it actually seems to apply only to MSAA.
> 


I'm ok with that change. Well in the txt file it seemed to make sense to
add that as a footnote for GL 3.0 (because without it you're not truly
GL 3.0 compliant) but I see this causes problems with parsing.

Roland



Re: [Mesa-dev] [PATCH] draw: allow LLVM use on non-SSE2 X86 cpus

2014-11-03 Thread david

On 2014-11-03 11:51, Roland Scheidegger wrote:
> On 01.11.2014 at 22:19, David Heidelberg wrote:
>>
>> This patch removes the workaround for a bug in LLVM < 3.2.
>>
>> Original bug has been closed as fixed in 2011.
>> At this moment gallium requires LLVM 3.3 (2013).
>>
>> LLVM has been tested without SSE2 support in commit
>> ca70de9bd20bc4a11b2d2d368e0cc1f49527a947 and removed after requiring
>> LLVM 3.3 in commit 013ff2fae13da41c2f5619c4698b0a7b5aa6a06d
>>
>> Original LLVM bug:
>> http://llvm.org/bugs/show_bug.cgi?id=6960
>>
>> Signed-off-by: David Heidelberg
>> ---
>>  src/gallium/auxiliary/draw/draw_context.c | 15 +--
>>  1 file changed, 1 insertion(+), 14 deletions(-)
>>
>
> Reviewed-by: Roland Scheidegger
>
> Though at this point I'll have to wonder why you'd actually want to use
> it with a non-sse2 capable cpu...


I don't have a non-SSE2 CPU. While cleaning up the Nine patches I just
noticed that there is a deprecated workaround :)


Thank you for reviewing.


[Mesa-dev] [Bug 85799] segfault since glsl: Drop constant 0.0 components from dot products

2014-11-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85799

Bug ID: 85799
   Summary: segfault since glsl: Drop constant 0.0 components from
dot products
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: adf.li...@gmail.com

Running radeonsi with git llvm.

Unreal Elemental demo is segfaulting for me since 

d056863b3c535aeebfe5fcfc9468eb33a06ddb60 is the first bad commit
commit d056863b3c535aeebfe5fcfc9468eb33a06ddb60
Author: Matt Turner 
Date:   Fri Oct 17 20:32:58 2014 -0700

glsl: Drop constant 0.0 components from dot products.

Helps a small number of vertex shaders in the games Dungeon Defenders
and Shank, as well as an internal benchmark.

instructions in affected programs: 2801 -> 2719 (-2.93%)



Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffed2d3700 (LWP 20246)]
0x71687717 in emit_dp (elements=, src1=..., src0=...,
dst=..., ir=0x7fffc874a238, this=0x7fffc86196c0) at
../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:721
721return emit(ir, dot_opcodes[elements - 2], dst, src0, src1);
(gdb) bt
#0  0x71687717 in emit_dp (elements=, src1=...,
src0=..., dst=..., ir=0x7fffc874a238, this=0x7fffc86196c0) at
../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:721
#1  glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0, ir=0x7fffc874a238) at
../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1755
#2  0x716857e9 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc874a798) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1315
#3  0x716857e9 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc86d4d38) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1315
#4  0x716857e9 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc86d4db8) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1315
#5  0x716857e9 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc8601da8) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1315
#6  0x716857e9 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc86022f8) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1315
#7  0x716857e9 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc8602848) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1315
#8  0x7168e1d3 in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=0x7fffc8602988) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:2405
#9  0x7167d8bc in glsl_to_tgsi_visitor::visit (this=0x7fffc86196c0,
ir=) at ../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:1193
#10 0x7171ae55 in visit_exec_list (list=,
visitor=visitor@entry=0x7fffc86196c0) at ../../src/glsl/ir.cpp:1792
#11 0x71698bf4 in get_mesa_program (shader=0x7fffc8653c98,
shader_program=0x7fffc86dd5d8, ctx=0x4a8fc00) at
../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5225
#12 st_link_shader (ctx=0x4a8fc00, prog=0x7fffc86dd5d8) at
../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5431
#13 0x716ad686 in _mesa_glsl_link_shader (ctx=0x4a8fc00,
prog=0x7fffc86dd5d8) at ../../src/mesa/program/ir_to_mesa.cpp:3038
#14 0x715ff14a in link_program (ctx=0x4a8fc00, program=)
at ../../src/mesa/main/shaderapi.c:917

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH][RFC] mesa/main: Clamp rgba with streamed sse

2014-11-03 Thread Bruno Jimenez
On Mon, 2014-11-03 at 20:39 +1100, Timothy Arceri wrote:
> On Fri, 2014-10-31 at 17:24 +, Jose Fonseca wrote:
> > On 31/10/14 17:01, Matt Turner wrote:
> > > On Fri, Oct 31, 2014 at 4:12 AM, Jose Fonseca  wrote:
> > >> On 31/10/14 10:13, Juha-Pekka Heikkila wrote:
> > >>>
> > >>>defined(__SSE2__) && defined(__GNUC__)
> > >>
> > >>
> > >> Instead of duplicating this expression everywhere, let's create a
> > >> "HAVE_SSE2_INTRIN" define.  Not only is this expression complex, it will
> > >> become even more so when we update it for MSVC.
> > >
> > > Isn't testing __SSE2__ sufficient? Does MSVC not do this?
> > >
> > > clang/icc/gcc all implement this and all of the _mm_* intrinsics.
> > >
> > 
> > No, __SSE2__ is a GCC-only macro.  It's not defined or needed by MSVC
> > compilers.  And I strongly suspect that the Intel compiler probably only
> > defines it for GCC compatibility.
> > 
> > 
> > This is because GCC is quite lame IMO: it can't distinguish between
> > "enabling SSE intrinsics" (i.e., allowing emmintrin.h to be included and
> > the Intel _mm_* intrinsics to be used) and emitting SSE2 opcodes of its own
> > accord.  That is, when you pass -msse2 to GCC, you're also giving carte
> > blanche for GCC to emit SSE2 opcodes for any C code!  Which makes it _very_
> > hard to have special code paths for SSE1/2/3/4/etc and no SSE, since you
> > basically need to compile each path in a different C module, passing
> > different -msse* flags to each.
> 
> So does anyone have a suggestion how this can be better organised? As in
> should there be an SSE folder somewhere?
> 
> Currently streaming-load-memcpy.c is in mesa/main even though it's only
> used by the Intel driver. My patch also adds another file there, and I've
> noticed this [1], which should be made to use a runtime switch too.
> 
> Dumping everything in Mesa main would obviously get messy fast.
> 
> [1]
> http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/drivers/dri/i965/intel_tex_subimage.c#n199

Hi,

FWIW, my opinion is that maybe we should move this kind of code to
utils/sse or something like that. After all, this is some utility code
that could be used anywhere.

- Bruno
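
To make the runtime-switch idea concrete, a shared utility could expose one entry point that picks an implementation the first time it is called. The following is only a sketch under stated assumptions: every name in it (`clamp_rgba`, `select_clamp_rgba`, `SKETCH_X86_GNU`) is hypothetical. It relies on the GCC/Clang-specific `target` attribute and `__builtin_cpu_supports`, which is exactly the non-portable mechanism mentioned in the quoted discussion, and it assumes a compiler recent enough to allow including emmintrin.h without -msse2. On any other compiler, MSVC included, the sketch simply uses the plain C++ path.

```cpp
#include <cassert>
#include <cstddef>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <emmintrin.h>
#define SKETCH_X86_GNU 1
#else
#define SKETCH_X86_GNU 0
#endif

using clamp_fn = void (*)(float *rgba, std::size_t n, float lo, float hi);

// Portable fallback: plain C++, no intrinsics.
static void clamp_rgba_c(float *rgba, std::size_t n, float lo, float hi) {
   for (std::size_t i = 0; i < 4 * n; ++i) {
      float x = rgba[i];
      rgba[i] = x < lo ? lo : (x > hi ? hi : x);
   }
}

#if SKETCH_X86_GNU
// Only this function is compiled with SSE2 enabled.
__attribute__((target("sse2")))
static void clamp_rgba_sse2(float *rgba, std::size_t n, float lo, float hi) {
   const __m128 vlo = _mm_set1_ps(lo);
   const __m128 vhi = _mm_set1_ps(hi);
   for (std::size_t i = 0; i < n; ++i) {
      __m128 v = _mm_loadu_ps(rgba + 4 * i);
      v = _mm_min_ps(_mm_max_ps(v, vlo), vhi);
      _mm_storeu_ps(rgba + 4 * i, v);
   }
}
#endif

// Pick an implementation based on what the running CPU supports.
static clamp_fn select_clamp_rgba() {
#if SKETCH_X86_GNU
   if (__builtin_cpu_supports("sse2"))
      return clamp_rgba_sse2;
#endif
   return clamp_rgba_c;
}

// Single entry point: clamps n RGBA pixels in place to [lo, hi].
void clamp_rgba(float *rgba, std::size_t n, float lo, float hi) {
   static const clamp_fn impl = select_clamp_rgba();
   assert(impl != nullptr);
   impl(rgba, n, lo, hi);
}
```

Callers only ever see `clamp_rgba()`, so where the per-ISA variants live (separate files built with their own flags, or attributed functions as here) stays an internal detail of the utility directory.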


[Mesa-dev] [PATCH] clover: Fix clBuildProgram piglit regression

2014-11-03 Thread Tom Stellard
Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices
is greater than zero.

Introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563

Reported by: EdB
---

Hi Francisco,

I understand what you are saying now about why we don't need to pass the
vector of devices.  It's because the device vector from the context
doesn't need to be validated so passing the list of device_ids is fine.

Here is an updated patch to fix the regression, which is more in line
with what you suggested before.

-Tom

 src/gallium/state_trackers/clover/api/program.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/program.cpp 
b/src/gallium/state_trackers/clover/api/program.cpp
index 64c4a43..3a6c054 100644
--- a/src/gallium/state_trackers/clover/api/program.cpp
+++ b/src/gallium/state_trackers/clover/api/program.cpp
@@ -27,7 +27,7 @@ using namespace clover;
 
 namespace {
void validate_build_program_common(const program &prog, cl_uint num_devs,
-  const ref_vector &devs,
+  const cl_device_id *d_devs,
   void (*pfn_notify)(cl_program, void *),
   void *user_data) {
 
@@ -39,7 +39,7 @@ namespace {
 
   if (any_of([&](const device &dev) {
return !count(dev, prog.context().devices());
-}, devs))
+}, objs(d_devs, num_devs)))
  throw error(CL_INVALID_DEVICE);
}
 }
@@ -177,7 +177,7 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
 ref_vector(prog.context().devices()));
auto opts = (p_opts ? p_opts : "");
 
-   validate_build_program_common(prog, num_devs, devs, pfn_notify, user_data);
+   validate_build_program_common(prog, num_devs, d_devs, pfn_notify, 
user_data);
 
prog.build(devs, opts);
return CL_SUCCESS;
@@ -200,7 +200,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
auto opts = (p_opts ? p_opts : "");
header_map headers;
 
-   validate_build_program_common(prog, num_devs, devs, pfn_notify, user_data);
+   validate_build_program_common(prog, num_devs, d_devs, pfn_notify, 
user_data);
 
if (bool(num_headers) != bool(header_names))
   throw error(CL_INVALID_VALUE);
-- 
1.8.5.5



Re: [Mesa-dev] [PATCH] clover: Fix clBuildProgram piglit regression

2014-11-03 Thread Francisco Jerez
Tom Stellard  writes:

> Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices
> is greater than zero.
>
> Introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
>
> Reported by: EdB
> ---
>
> Hi Francisco,
>
> I understand what you are saying now about why we don't need to pass the
> vector of devices.  It's because the device vector from the context
> doesn't need to be validated so passing the list of device_ids is fine.
>
> Here is an updated patch to fix the regression, which is more in line
> with what you suggested before.
>
> -Tom

Looks good, thanks,
Reviewed-by: Francisco Jerez 

>
>  src/gallium/state_trackers/clover/api/program.cpp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/api/program.cpp 
> b/src/gallium/state_trackers/clover/api/program.cpp
> index 64c4a43..3a6c054 100644
> --- a/src/gallium/state_trackers/clover/api/program.cpp
> +++ b/src/gallium/state_trackers/clover/api/program.cpp
> @@ -27,7 +27,7 @@ using namespace clover;
>  
>  namespace {
> void validate_build_program_common(const program &prog, cl_uint num_devs,
> -  const ref_vector &devs,
> +  const cl_device_id *d_devs,
>void (*pfn_notify)(cl_program, void *),
>void *user_data) {
>  
> @@ -39,7 +39,7 @@ namespace {
>  
>if (any_of([&](const device &dev) {
> return !count(dev, prog.context().devices());
> -}, devs))
> +}, objs(d_devs, num_devs)))
>   throw error(CL_INVALID_DEVICE);
> }
>  }
> @@ -177,7 +177,7 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
>  ref_vector(prog.context().devices()));
> auto opts = (p_opts ? p_opts : "");
>  
> -   validate_build_program_common(prog, num_devs, devs, pfn_notify, 
> user_data);
> +   validate_build_program_common(prog, num_devs, d_devs, pfn_notify, 
> user_data);
>  
> prog.build(devs, opts);
> return CL_SUCCESS;
> @@ -200,7 +200,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
> auto opts = (p_opts ? p_opts : "");
> header_map headers;
>  
> -   validate_build_program_common(prog, num_devs, devs, pfn_notify, 
> user_data);
> +   validate_build_program_common(prog, num_devs, d_devs, pfn_notify, 
> user_data);
>  
> if (bool(num_headers) != bool(header_names))
>throw error(CL_INVALID_VALUE);
> -- 
> 1.8.5.5




[Mesa-dev] [PATCH] i965/disasm: Disassemble tdr and tm registers properly.

2014-11-03 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_disasm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
b/src/mesa/drivers/dri/i965/brw_disasm.c
index 53ec767..a0f6d57 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++ b/src/mesa/drivers/dri/i965/brw_disasm.c
@@ -681,6 +681,12 @@ reg(FILE *file, unsigned _reg_file, unsigned _reg_nr)
  string(file, "ip");
  return -1;
  break;
+  case BRW_ARF_TDR:
+ format(file, "tdr");
+ return -1;
+  case BRW_ARF_TIMESTAMP:
+ format(file, "tm%d", _reg_nr & 0x0f);
+ break;
   default:
  format(file, "ARF%d", _reg_nr);
  break;
-- 
2.0.4



[Mesa-dev] [Bug 85799] segfault since glsl: Drop constant 0.0 components from dot products

2014-11-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85799

Matt Turner  changed:

      What|Removed                        |Added
----------------------------------------------------------
    Status|NEW                            |RESOLVED
Resolution|---                            |FIXED
  Assignee|mesa-dev@lists.freedesktop.org |matts...@gmail.com

--- Comment #1 from Matt Turner  ---
commit 336e76c1439823185d425ebecb849ce38d55c4eb
Author: Matt Turner 
Date:   Fri Oct 31 10:33:17 2014 -0700

glsl: Emit mul instead of dot if only one component left.
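
The backtrace in the original report points at `dot_opcodes[elements - 2]`. One plausible reading, inferred only from that line and from the title of the fix, is that once constant 0.0 components are dropped a dot product can be left with a single component, so the table index becomes -1. The sketch below illustrates that failure mode and a guard against it. The names `Op`, `dot_ops` and `select_dot_op` are hypothetical, and this is not the Mesa code.

```cpp
#include <cassert>

// Hypothetical opcode set; the table covers 2-, 3- and 4-component dots.
enum class Op { Mul, Dp2, Dp3, Dp4 };

static const Op dot_ops[] = { Op::Dp2, Op::Dp3, Op::Dp4 };

// Choose the opcode for a dot product over `elements` components.
Op select_dot_op(int elements) {
   assert(elements >= 1 && elements <= 4);
   // With one component, elements - 2 would index the table at -1.
   // A one-component dot product is just a multiply.
   if (elements == 1)
      return Op::Mul;
   return dot_ops[elements - 2];
}
```

Whether the guard belongs in the code that emits the opcode or in the optimization that shrinks the dot product is a design choice. The commit title above suggests the fix was made where the expression is rewritten.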



Re: [Mesa-dev] [PATCH 4/5] i965/fs: Dead code eliminate instructions writing the flag.

2014-11-03 Thread Matt Turner
On Wed, Oct 29, 2014 at 1:10 PM, Matt Turner  wrote:
> Most prominently helps Natural Selection 2, which has a surprising
> number of shaders that do very complicated things before drawing black.
>
> instructions in affected programs: 23824 -> 19570 (-17.86%)
> ---
>  .../dri/i965/brw_fs_dead_code_eliminate.cpp| 23 
> +++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> index 9cf8d89..414c4a0 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> @@ -43,15 +43,16 @@ fs_visitor::dead_code_eliminate()
>
> int num_vars = live_intervals->num_vars;
> BITSET_WORD *live = ralloc_array(NULL, BITSET_WORD, 
> BITSET_WORDS(num_vars));
> +   BITSET_WORD *flag_live = ralloc_array(NULL, BITSET_WORD, 1);
>
> foreach_block (block, cfg) {
>memcpy(live, live_intervals->block_data[block->num].liveout,
>   sizeof(BITSET_WORD) * BITSET_WORDS(num_vars));
> +  memcpy(flag_live, live_intervals->block_data[block->num].flag_liveout,
> + sizeof(BITSET_WORD));
>
>foreach_inst_in_block_reverse(fs_inst, inst, block) {
> - if (inst->dst.file == GRF &&
> - !inst->has_side_effects() &&
> - !inst->writes_flag()) {
> + if (inst->dst.file == GRF && !inst->has_side_effects()) {
>  bool result_live = false;
>
>  if (inst->regs_written == 1) {
> @@ -76,6 +77,13 @@ fs_visitor::dead_code_eliminate()
>  }
>   }
>
> + if (inst->dst.is_null() && inst->writes_flag()) {
> +if (!BITSET_TEST(flag_live, inst->flag_subreg)) {
> +   inst->opcode = BRW_OPCODE_NOP;

progress = true;

> +   continue;
> +}
> + }
> +
>   if (inst->dst.file == GRF) {
>  if (!inst->is_partial_write()) {
> int var = live_intervals->var_from_reg(&inst->dst);
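
The idea in the quoted patch, tracking liveness of the flag register alongside the GRFs while walking each block backwards, can be sketched in isolation. This is a much-simplified illustration under stated assumptions: `Inst` and `dead_code_eliminate_block` are hypothetical, there is a single flag register, and partial writes and predication are ignored. Unlike the real pass, it never rewrites a dead destination to null while keeping a live flag write.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified instruction.
struct Inst {
   int dst;                 // destination register, or -1 for "null"
   std::vector<int> srcs;   // source registers
   bool writes_flag;
   bool reads_flag;
   bool has_side_effects;
   bool is_nop;
};

// Walk one basic block backwards and turn dead instructions into NOPs.
// `live` and `flag_live` start out as the live-out state of the block.
// Returns true if any instruction was eliminated.
bool dead_code_eliminate_block(std::vector<Inst> &block,
                               std::set<int> live, bool flag_live) {
   bool progress = false;
   for (auto it = block.rbegin(); it != block.rend(); ++it) {
      Inst &inst = *it;
      assert(inst.dst >= -1);
      if (inst.is_nop)
         continue;

      const bool result_live = inst.dst >= 0 && live.count(inst.dst) > 0;
      const bool flag_result_live = inst.writes_flag && flag_live;

      if (!inst.has_side_effects && !result_live && !flag_result_live) {
         inst.is_nop = true;
         progress = true;   // report the change to the caller
         continue;
      }

      // A definition ends liveness above it; uses start it.
      if (inst.dst >= 0)
         live.erase(inst.dst);
      if (inst.writes_flag)
         flag_live = false;
      for (int s : inst.srcs)
         live.insert(s);
      if (inst.reads_flag)
         flag_live = true;
   }
   return progress;
}
```

For a block of two compares followed by a select that reads the flag, the first compare is overwritten before anyone reads it, so it becomes a NOP while the second compare survives.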


[Mesa-dev] [PATCH 6/5] i965/fs: Remove opt_drop_redundant_mov_to_flags().

2014-11-03 Thread Matt Turner
Dead code elimination now handles this.
---
Depends on the previously sent 5 patch series.

 src/mesa/drivers/dri/i965/brw_fs.cpp | 31 ---
 src/mesa/drivers/dri/i965/brw_fs.h   |  1 -
 2 files changed, 32 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 39c6231..baf9166 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3414,35 +3414,6 @@ fs_visitor::calculate_register_pressure()
}
 }
 
-/**
- * Look for repeated FS_OPCODE_MOV_DISPATCH_TO_FLAGS and drop the later ones.
- *
- * The needs_unlit_centroid_workaround ends up producing one of these per
- * channel of centroid input, so it's good to clean them up.
- *
- * An assumption here is that nothing ever modifies the dispatched pixels
- * value that FS_OPCODE_MOV_DISPATCH_TO_FLAGS reads from, but the hardware
- * dictates that anyway.
- */
-void
-fs_visitor::opt_drop_redundant_mov_to_flags()
-{
-   bool flag_mov_found[2] = {false};
-
-   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
-  if (inst->is_control_flow()) {
- memset(flag_mov_found, 0, sizeof(flag_mov_found));
-  } else if (inst->opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) {
- if (!flag_mov_found[inst->flag_subreg])
-flag_mov_found[inst->flag_subreg] = true;
- else
-inst->remove(block);
-  } else if (inst->writes_flag()) {
- flag_mov_found[inst->flag_subreg] = false;
-  }
-   }
-}
-
 bool
 fs_visitor::run()
 {
@@ -3518,8 +3489,6 @@ fs_visitor::run()
   assign_constant_locations();
   demote_pull_constants();
 
-  opt_drop_redundant_mov_to_flags();
-
 #define OPT(pass, args...) do {\
   pass_num++;  \
   bool this_progress = pass(args); \
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index d9150c3..ccfb12d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -434,7 +434,6 @@ public:
bool try_constant_propagate(fs_inst *inst, acp_entry *entry);
bool opt_copy_propagate_local(void *mem_ctx, bblock_t *block,
  exec_list *acp);
-   void opt_drop_redundant_mov_to_flags();
bool opt_register_renaming();
bool register_coalesce();
bool compute_to_mrf();
-- 
2.0.4



[Mesa-dev] [PATCH 2/2] i965/vec4: Rewrite dead code elimination to use live in/out.

2014-11-03 Thread Matt Turner
Improves 359 shaders by >=10%
 114 shaders by >=20%
  91 shaders by >=30%
  82 shaders by >=40%
  22 shaders by >=50%
   4 shaders by >=60%
   2 shaders by >=80%

total instructions in shared programs: 5505182 -> 5482260 (-0.42%)
instructions in affected programs: 364629 -> 341707 (-6.29%)
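
For readers unfamiliar with the approach, here is a minimal sketch of dead
code elimination driven by live-out information for a single block. It is
not the code from this patch: Inst, eliminate_dead_code, and the 32-variable
bitmask are assumptions made purely for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical instruction: writes variable `dst` (or -1 for none) and
// reads the variables listed in `srcs`. Variable numbers must be < 32.
struct Inst {
   int dst;
   std::vector<int> srcs;
   bool side_effects;
   bool dead;
};

// Walk the block bottom-up, starting from the set of variables that are
// live on exit. An instruction whose result is not live at that point, and
// which has no side effects, is marked dead. Returns how many were marked.
int eliminate_dead_code(std::vector<Inst> &block, uint32_t live_out)
{
   uint32_t live = live_out;
   int removed = 0;

   for (auto it = block.rbegin(); it != block.rend(); ++it) {
      const uint32_t dst_bit = it->dst >= 0 ? (1u << it->dst) : 0u;

      if (it->dst >= 0 && !it->side_effects && !(live & dst_bit)) {
         it->dead = true;
         ++removed;
         continue;   // a dead instruction does not keep its sources alive
      }

      live &= ~dst_bit;          // the write screens off earlier definitions
      for (int s : it->srcs)
         live |= 1u << s;        // the reads make the sources live
   }

   return removed;
}
```

Because the live set is updated as the walk proceeds, removing one
instruction can expose earlier ones as dead within the same pass.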
---
 src/mesa/drivers/dri/i965/Makefile.sources |   1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 155 ---
 .../dri/i965/brw_vec4_dead_code_eliminate.cpp  | 169 +
 3 files changed, 170 insertions(+), 155 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 711aabe..10be4f1 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -102,6 +102,7 @@ i965_FILES = \
brw_vec4.cpp \
brw_vec4_copy_propagation.cpp \
brw_vec4_cse.cpp \
+   brw_vec4_dead_code_eliminate.cpp \
brw_vec4_generator.cpp \
brw_vec4_gs_visitor.cpp \
brw_vec4_live_variables.cpp \
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index df589b8..6560351 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -411,161 +411,6 @@ vec4_visitor::opt_reduce_swizzle()
return progress;
 }
 
-static bool
-try_eliminate_instruction(vec4_instruction *inst, int new_writemask,
-  const struct brw_context *brw)
-{
-   if (inst->has_side_effects())
-  return false;
-
-   if (new_writemask == 0) {
-  /* Don't dead code eliminate instructions that write to the
-   * accumulator as a side-effect. Instead just set the destination
-   * to the null register to free it.
-   */
-  if (inst->writes_accumulator || inst->writes_flag()) {
- inst->dst = dst_reg(retype(brw_null_reg(), inst->dst.type));
-  } else {
- inst->opcode = BRW_OPCODE_NOP;
-  }
-
-  return true;
-   } else if (inst->dst.writemask != new_writemask) {
-  switch (inst->opcode) {
-  case SHADER_OPCODE_TXF_CMS:
-  case SHADER_OPCODE_GEN4_SCRATCH_READ:
-  case VS_OPCODE_PULL_CONSTANT_LOAD:
-  case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
- break;
-  default:
- /* Do not set a writemask on Gen6 for math instructions, those are
-  * executed using align1 mode that does not support a destination mask.
-  */
- if (!(brw->gen == 6 && inst->is_math()) && !inst->is_tex()) {
-inst->dst.writemask = new_writemask;
-return true;
- }
-  }
-   }
-
-   return false;
-}
-
-/**
- * Must be called after calculate_live_intervals() to remove unused
- * writes to registers -- register allocation will fail otherwise
- * because something deffed but not used won't be considered to
- * interfere with other regs.
- */
-bool
-vec4_visitor::dead_code_eliminate()
-{
-   bool progress = false;
-   int pc = -1;
-
-   calculate_live_intervals();
-
-   foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
-  pc++;
-
-  bool inst_writes_flag = false;
-  if (inst->dst.file != GRF) {
- if (inst->dst.is_null() && inst->writes_flag()) {
-inst_writes_flag = true;
- } else {
-continue;
- }
-  }
-
-  if (inst->dst.file == GRF) {
- int write_mask = inst->dst.writemask;
-
- for (int c = 0; c < 4; c++) {
-if (write_mask & (1 << c)) {
-   assert(this->virtual_grf_end[inst->dst.reg * 4 + c] >= pc);
-   if (this->virtual_grf_end[inst->dst.reg * 4 + c] == pc) {
-  write_mask &= ~(1 << c);
-   }
-}
- }
-
- progress = try_eliminate_instruction(inst, write_mask, brw) ||
-progress;
-  }
-
-  if (inst->predicate || inst->prev == NULL)
- continue;
-
-  int dead_channels;
-  if (inst_writes_flag) {
-/* Arbitrarily chosen, other than not being an xyzw writemask. */
-#define FLAG_WRITEMASK (1 << 5)
- dead_channels = inst->reads_flag() ? 0 : FLAG_WRITEMASK;
-  } else {
- dead_channels = inst->dst.writemask;
-
- for (int i = 0; i < 3; i++) {
-if (inst->src[i].file != GRF ||
-inst->src[i].reg != inst->dst.reg)
-  continue;
-
-for (int j = 0; j < 4; j++) {
-   int swiz = BRW_GET_SWZ(inst->src[i].swizzle, j);
-   dead_channels &= ~(1 << swiz);
-}
- }
-  }
-
-  foreach_inst_in_block_reverse_starting_from(vec4_instruction, scan_inst,
-  inst, block) {
- if (dead_channels == 0)
-break;
-
- if (inst_writes_flag) {
-if (scan_ins

[Mesa-dev] [PATCH 1/2] i965/vec4: Track liveness of the flag register.

2014-11-03 Thread Matt Turner
---
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 28 ++
 .../drivers/dri/i965/brw_vec4_live_variables.h |  5 
 2 files changed, 33 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 4c8a2ef..9835069 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -85,6 +85,11 @@ vec4_live_variables::setup_def_use()
}
}
 }
+ if (inst->reads_flag()) {
+if (!BITSET_TEST(bd->flag_def, 0)) {
+   BITSET_SET(bd->flag_use, 0);
+}
+ }
 
 /* Check for unconditional writes to whole registers. These
  * are the things that screen off preceding definitions of a
@@ -101,6 +106,11 @@ vec4_live_variables::setup_def_use()
}
 }
  }
+ if (inst->writes_flag()) {
+if (!BITSET_TEST(bd->flag_use, 0)) {
+   BITSET_SET(bd->flag_def, 0);
+}
+ }
 
 ip++;
   }
@@ -134,6 +144,13 @@ vec4_live_variables::compute_live_variables()
cont = true;
}
 }
+ BITSET_WORD new_livein = (bd->flag_use[0] |
+   (bd->flag_liveout[0] &
+~bd->flag_def[0]));
+ if (new_livein & ~bd->flag_livein[0]) {
+bd->flag_livein[0] |= new_livein;
+cont = true;
+ }
 
 /* Update liveout */
 foreach_list_typed(bblock_link, child_link, link, &block->children) {
@@ -147,6 +164,12 @@ vec4_live_variables::compute_live_variables()
  cont = true;
   }
}
+BITSET_WORD new_liveout = (child_bd->flag_livein[0] &
+   ~bd->flag_liveout[0]);
+if (new_liveout) {
+   bd->flag_liveout[0] |= new_liveout;
+   cont = true;
+}
 }
   }
}
@@ -166,6 +189,11 @@ vec4_live_variables::vec4_live_variables(vec4_visitor *v, cfg_t *cfg)
   block_data[i].use = rzalloc_array(mem_ctx, BITSET_WORD, bitset_words);
   block_data[i].livein = rzalloc_array(mem_ctx, BITSET_WORD, bitset_words);
   block_data[i].liveout = rzalloc_array(mem_ctx, BITSET_WORD, bitset_words);
+
+  block_data[i].flag_def[0] = 0;
+  block_data[i].flag_use[0] = 0;
+  block_data[i].flag_livein[0] = 0;
+  block_data[i].flag_liveout[0] = 0;
}
 
setup_def_use();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index 6f736be..5e68383 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -49,6 +49,11 @@ struct block_data {
 
/** Which defs reach the exit point of the block. */
BITSET_WORD *liveout;
+
+   BITSET_WORD flag_def[1];
+   BITSET_WORD flag_use[1];
+   BITSET_WORD flag_livein[1];
+   BITSET_WORD flag_liveout[1];
 };
 
 class vec4_live_variables {
-- 
2.0.4



[Mesa-dev] [PATCH] util: Implement unreachable for MSVC using __assume

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Based on the description of __assume at:

http://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx
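
As a rough illustration of the idea, here is a self-contained sketch of a
portable unreachable-style macro. SKETCH_UNREACHABLE and axis_index are
hypothetical names, and the abort() fallback is my own assumption; this is
not the macro from src/util/macros.h.

```cpp
#include <cassert>
#include <cstdlib>

// Tell the compiler a point in the code can never be reached. In debug
// builds the assert fires first, because !"some string" is always false.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE(str) \
   do { assert(!str); __builtin_unreachable(); } while (0)
#elif defined(_MSC_VER)
#define SKETCH_UNREACHABLE(str) \
   do { assert(!str); __assume(0); } while (0)
#else
/* Assumed fallback for other compilers: fail loudly instead of hinting. */
#define SKETCH_UNREACHABLE(str) \
   do { assert(!str); abort(); } while (0)
#endif

enum class Axis { X, Y, Z };

int axis_index(Axis a)
{
   switch (a) {
   case Axis::X: return 0;
   case Axis::Y: return 1;
   case Axis::Z: return 2;
   }

   // Without the hint, compilers tend to warn that control reaches the
   // end of a non-void function.
   SKETCH_UNREACHABLE("invalid Axis value");
}
```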

Signed-off-by: Ian Romanick 
Cc: Brian Paul 
---
 src/util/macros.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/util/macros.h b/src/util/macros.h
index ff37a7d..da5daff 100644
--- a/src/util/macros.h
+++ b/src/util/macros.h
@@ -69,6 +69,12 @@ do {\
assert(!str);\
__builtin_unreachable(); \
 } while (0)
+#elif _MSC_VER >= 1200
+#define unreachable(str)\
+do {\
+   assert(!str);\
+   __assume(0); \
+} while (0)
 #endif
 
 #ifndef unreachable
-- 
1.8.1.4



[Mesa-dev] [PATCH] mesa: Silence unused parameter warning in check_context_limits in non-debug builds

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

../../src/mesa/main/context.c: In function 'check_context_limits':
../../src/mesa/main/context.c:733:41: warning: unused parameter 'ctx' [-Wunused-parameter]
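
The warning only shows up in non-debug builds because assert() expands to
nothing when NDEBUG is defined, which leaves the parameter with no remaining
uses. A minimal sketch of the pattern, using a hypothetical Limits struct
and check_limits function rather than the real gl_context:

```cpp
#include <cassert>

struct Limits {
   unsigned max_slots;   // hypothetical field, for illustration only
};

bool check_limits(const Limits *lim)
{
   // Counts as a use of `lim` even when every assert below is compiled
   // out, so -Wunused-parameter stays quiet in release builds.
   (void) lim;

   assert(lim->max_slots <= 64);
   return true;
}
```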

Signed-off-by: Ian Romanick 
---
 src/mesa/main/context.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 7c62dbc..400c158 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -732,6 +732,8 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
 static void
 check_context_limits(struct gl_context *ctx)
 {
+   (void) ctx;
+
/* check that we don't exceed the size of various bitfields */
assert(VARYING_SLOT_MAX <=
  (8 * sizeof(ctx->VertexProgram._Current->Base.OutputsWritten)));
-- 
1.8.1.4



Re: [Mesa-dev] [PATCH] i965/disasm: Disassemble tdr and tm registers properly.

2014-11-03 Thread Kenneth Graunke
On Monday, November 03, 2014 11:00:04 AM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_disasm.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
b/src/mesa/drivers/dri/i965/brw_disasm.c
> index 53ec767..a0f6d57 100644
> --- a/src/mesa/drivers/dri/i965/brw_disasm.c
> +++ b/src/mesa/drivers/dri/i965/brw_disasm.c
> @@ -681,6 +681,12 @@ reg(FILE *file, unsigned _reg_file, unsigned _reg_nr)
>   string(file, "ip");
>   return -1;
>   break;
> +  case BRW_ARF_TDR:
> + format(file, "tdr");

Maybe print "tdr0" to match the docs?

> + return -1;
> +  case BRW_ARF_TIMESTAMP:
> + format(file, "tm%d", _reg_nr & 0x0f);

_reg_nr *should* always be 0, unless it's invalid.  Maybe just print "tm0"?

Either way, this is a clear improvement over ARF192!
Reviewed-by: Kenneth Graunke 

> + break;
>default:
>   format(file, "ARF%d", _reg_nr);
>   break;
> 




[Mesa-dev] [PATCH 01/10] mesa/main: Pass the data that _mesa_uniform actually wants

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

The GL_ enums were previously used because glsl_types.h couldn't be used
in C code.  That was fixed some time ago (and uniforms.c already
includes glsl_types.h), so this is no longer necessary.
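
To make the shape of the change concrete, here is a small sketch in which
each entry point passes the base type and component count directly, so the
shared back end no longer has to decode an enum. BaseType, Upload,
upload_uniform, and the wrappers are hypothetical stand-ins, not Mesa's
actual declarations.

```cpp
#include <cstddef>

// Hypothetical stand-in for glsl_base_type.
enum class BaseType { Float, Int, UInt };

struct Upload {
   BaseType base_type;
   unsigned src_components;
   std::size_t scalars;   // total scalar values consumed from the caller
};

// Shared back end: receives exactly the two facts it needs, so there is
// no switch over GL_FLOAT_VEC2-style enums here.
Upload upload_uniform(int count, BaseType base_type, unsigned src_components)
{
   return Upload{base_type, src_components,
                 static_cast<std::size_t>(count) * src_components};
}

// Each entry point already knows its own type and width statically.
Upload uniform2f(int count) { return upload_uniform(count, BaseType::Float, 2); }
Upload uniform3i(int count) { return upload_uniform(count, BaseType::Int, 3); }
```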

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 73 ++-
 src/mesa/main/uniforms.c| 96 -
 src/mesa/main/uniforms.h|  4 +-
 3 files changed, 54 insertions(+), 119 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index fcb14c4..77217cb 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -583,12 +583,12 @@ _mesa_propagate_uniforms_to_driver_storage(struct gl_uniform_storage *uni,
 extern "C" void
 _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
  GLint location, GLsizei count,
-  const GLvoid *values, GLenum type)
+  const GLvoid *values,
+  enum glsl_base_type basicType,
+  unsigned src_components)
 {
unsigned offset;
unsigned components;
-   unsigned src_components;
-   enum glsl_base_type basicType;
 
struct gl_uniform_storage *const uni =
   validate_uniform_parameters(ctx, shProg, location, count,
@@ -598,73 +598,6 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
 
/* Verify that the types are compatible.
 */
-   switch (type) {
-   case GL_FLOAT:
-  basicType = GLSL_TYPE_FLOAT;
-  src_components = 1;
-  break;
-   case GL_FLOAT_VEC2:
-  basicType = GLSL_TYPE_FLOAT;
-  src_components = 2;
-  break;
-   case GL_FLOAT_VEC3:
-  basicType = GLSL_TYPE_FLOAT;
-  src_components = 3;
-  break;
-   case GL_FLOAT_VEC4:
-  basicType = GLSL_TYPE_FLOAT;
-  src_components = 4;
-  break;
-   case GL_UNSIGNED_INT:
-  basicType = GLSL_TYPE_UINT;
-  src_components = 1;
-  break;
-   case GL_UNSIGNED_INT_VEC2:
-  basicType = GLSL_TYPE_UINT;
-  src_components = 2;
-  break;
-   case GL_UNSIGNED_INT_VEC3:
-  basicType = GLSL_TYPE_UINT;
-  src_components = 3;
-  break;
-   case GL_UNSIGNED_INT_VEC4:
-  basicType = GLSL_TYPE_UINT;
-  src_components = 4;
-  break;
-   case GL_INT:
-  basicType = GLSL_TYPE_INT;
-  src_components = 1;
-  break;
-   case GL_INT_VEC2:
-  basicType = GLSL_TYPE_INT;
-  src_components = 2;
-  break;
-   case GL_INT_VEC3:
-  basicType = GLSL_TYPE_INT;
-  src_components = 3;
-  break;
-   case GL_INT_VEC4:
-  basicType = GLSL_TYPE_INT;
-  src_components = 4;
-  break;
-   case GL_BOOL:
-   case GL_BOOL_VEC2:
-   case GL_BOOL_VEC3:
-   case GL_BOOL_VEC4:
-   case GL_FLOAT_MAT2:
-   case GL_FLOAT_MAT2x3:
-   case GL_FLOAT_MAT2x4:
-   case GL_FLOAT_MAT3x2:
-   case GL_FLOAT_MAT3:
-   case GL_FLOAT_MAT3x4:
-   case GL_FLOAT_MAT4x2:
-   case GL_FLOAT_MAT4x3:
-   case GL_FLOAT_MAT4:
-   default:
-  _mesa_problem(NULL, "Invalid type in %s", __func__);
-  return;
-   }
-
if (uni->type->is_sampler()) {
   components = 1;
} else {
diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index c307107..f7d5e89 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -151,7 +151,7 @@ void GLAPIENTRY
 _mesa_Uniform1f(GLint location, GLfloat v0)
 {
GET_CURRENT_CONTEXT(ctx);
-   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, &v0, GL_FLOAT);
+   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, &v0, GLSL_TYPE_FLOAT, 1);
 }
 
 void GLAPIENTRY
@@ -161,7 +161,7 @@ _mesa_Uniform2f(GLint location, GLfloat v0, GLfloat v1)
GLfloat v[2];
v[0] = v0;
v[1] = v1;
-   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GL_FLOAT_VEC2);
+   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GLSL_TYPE_FLOAT, 2);
 }
 
 void GLAPIENTRY
@@ -172,7 +172,7 @@ _mesa_Uniform3f(GLint location, GLfloat v0, GLfloat v1, GLfloat v2)
v[0] = v0;
v[1] = v1;
v[2] = v2;
-   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GL_FLOAT_VEC3);
+   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GLSL_TYPE_FLOAT, 3);
 }
 
 void GLAPIENTRY
@@ -185,14 +185,14 @@ _mesa_Uniform4f(GLint location, GLfloat v0, GLfloat v1, GLfloat v2,
v[1] = v1;
v[2] = v2;
v[3] = v3;
-   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GL_FLOAT_VEC4);
+   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GLSL_TYPE_FLOAT, 4);
 }
 
 void GLAPIENTRY
 _mesa_Uniform1i(GLint location, GLint v0)
 {
GET_CURRENT_CONTEXT(ctx);
-   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, &v0, GL_INT);
+   _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, &v0, GLSL_TYPE_INT, 1);
 }
 
 void GLAPIENTRY
@@ -202,7 +202,7 @@ _mesa_Uniform2i(GLint location, GLint v0, GLint v1)
GLint v[2];
v[0] = v0;
v[1] = v1;

[Mesa-dev] [PATCH 02/10] mesa: Remove GLSL_TYPE_SAMPLER check

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Noting the assertion just a few lines earlier, returnType cannot be
GLSL_TYPE_SAMPLER.

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 77217cb..aefa8b8 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -342,8 +342,7 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, GLint location,
*/
   if (returnType == uni->type->base_type
  || ((returnType == GLSL_TYPE_INT
-  || returnType == GLSL_TYPE_UINT
-  || returnType == GLSL_TYPE_SAMPLER)
+  || returnType == GLSL_TYPE_UINT)
  &&
  (uni->type->base_type == GLSL_TYPE_INT
   || uni->type->base_type == GLSL_TYPE_UINT
-- 
1.8.1.4



[Mesa-dev] [PATCH 05/10] mesa: Get some gl_shader_program::LinkStatus checking out of the main path

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

I really wanted to remove 'shProg != NULL' as well, but that would have
required adding a dummy program as the default program.  That seemed
like more churn than removing one test was worth.
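
The diff below relies on unlinked programs having an empty remap table, so
the bounds check alone rejects them and the link status only needs to be
read once something has already gone wrong. A minimal sketch of that
ordering, with hypothetical Program and Result types; the handling of
negative locations other than -1 is my own assumption, added so the sketch
is complete:

```cpp
#include <vector>

enum class Result { Ok, NotLinked, BadLocation, Ignored };

// Hypothetical stand-in for gl_shader_program: an unlinked program is
// assumed to have an empty remap table.
struct Program {
   bool linked;
   std::vector<int> remap;
};

Result validate(const Program &p, int location)
{
   // Hot path: one comparison. Link status is only examined on failure,
   // to decide which error to report.
   if (location >= static_cast<int>(p.remap.size()))
      return p.linked ? Result::BadLocation : Result::NotLinked;

   // Negative locations pass the bounds check above, so an unlinked
   // program still has to be caught here.
   if (location < 0) {
      if (!p.linked)
         return Result::NotLinked;
      return location == -1 ? Result::Ignored : Result::BadLocation;
   }

   return Result::Ok;
}
```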

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index a1ca367..16e08d4 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -178,7 +178,7 @@ validate_uniform_parameters(struct gl_context *ctx,
unsigned *array_index,
const char *caller)
 {
-   if (!shProg || !shProg->LinkStatus) {
+   if (shProg == NULL) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program not linked)", caller);
   return NULL;
}
@@ -193,15 +193,28 @@ validate_uniform_parameters(struct gl_context *ctx,
   return NULL;
}
 
-   /* Check that the given location is in bounds of uniform remap table. */
-   if (location >= (GLint) shProg->NumUniformRemapTable) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(location=%d)",
-  caller, location);
+   /* Check that the given location is in bounds of uniform remap table.
+* Unlinked programs will have NumUniformRemapTable == 0, so we can take
+* the shProg->LinkStatus check out of the main path.
+*/
+   if (unlikely(location >= (GLint) shProg->NumUniformRemapTable)) {
+  if (!shProg->LinkStatus)
+ _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program not linked)",
+ caller);
+  else
+ _mesa_error(ctx, GL_INVALID_OPERATION, "%s(location=%d)",
+ caller, location);
+
   return NULL;
}
 
-   if (location == -1)
+   if (location == -1) {
+  if (!shProg->LinkStatus)
+ _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program not linked)",
+ caller);
+
   return NULL;
+   }
 
/* Page 82 (page 96 of the PDF) of the OpenGL 2.1 spec says:
 *
-- 
1.8.1.4



[Mesa-dev] [PATCH 04/10] mesa: Rework location == -1 error checking

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Only one caller wanted to generate an error when location == -1, so move
the error generation to that caller.  There will be more callers in the
future that do not want to generate errors.

Move the location == -1 check later in validate_uniform_parameters.  As
currently implemented, glUniform1iv(-1, -1, data) would not generate an
error, but it should due to count being < 0.

The location that I have moved it to will make more sense with the next
commit.
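
The ordering issue can be shown with a tiny sketch: if the location == -1
early-out runs before the count check, a negative count slips through
unreported. Status and check_args are hypothetical names used only for
illustration.

```cpp
enum class Status { Ok, InvalidValue, SilentlyIgnored };

Status check_args(int location, int count)
{
   // A negative count is an error regardless of the location, so it has
   // to be tested first.
   if (count < 0)
      return Status::InvalidValue;

   // Only now may location == -1 cause the call to be dropped silently.
   if (location == -1)
      return Status::SilentlyIgnored;

   return Status::Ok;
}
```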

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 76 -
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index a6992c7..a1ca367 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -176,46 +176,13 @@ validate_uniform_parameters(struct gl_context *ctx,
struct gl_shader_program *shProg,
GLint location, GLsizei count,
unsigned *array_index,
-   const char *caller,
-   bool negative_one_is_not_valid)
+   const char *caller)
 {
if (!shProg || !shProg->LinkStatus) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program not linked)", caller);
   return NULL;
}
 
-   if (location == -1) {
-  /* For glGetUniform, page 264 (page 278 of the PDF) of the OpenGL 2.1
-   * spec says:
-   *
-   * "The error INVALID_OPERATION is generated if program has not been
-   * linked successfully, or if location is not a valid location for
-   * program."
-   *
-   * For glUniform, page 82 (page 96 of the PDF) of the OpenGL 2.1 spec
-   * says:
-   *
-   * "If the value of location is -1, the Uniform* commands will
-   * silently ignore the data passed in, and the current uniform
-   * values will not be changed."
-   *
-   * Allowing -1 for the location parameter of glUniform allows
-   * applications to avoid error paths in the case that, for example, some
-   * uniform variable is removed by the compiler / linker after
-   * optimization.  In this case, the new value of the uniform is dropped
-   * on the floor.  For the case of glGetUniform, there is nothing
-   * sensible to do for a location of -1.
-   *
-   * The negative_one_is_not_valid flag selects between the two behaviors.
-   */
-  if (negative_one_is_not_valid) {
-_mesa_error(ctx, GL_INVALID_OPERATION, "%s(location=%d)",
-caller, location);
-  }
-
-  return NULL;
-   }
-
/* From page 12 (page 26 of the PDF) of the OpenGL 2.1 spec:
 *
 * "If a negative number is provided where an argument of type sizei or
@@ -233,6 +200,9 @@ validate_uniform_parameters(struct gl_context *ctx,
   return NULL;
}
 
+   if (location == -1)
+  return NULL;
+
/* Page 82 (page 96 of the PDF) of the OpenGL 2.1 spec says:
 *
 * "If any of the following conditions occur, an INVALID_OPERATION
@@ -308,9 +278,39 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, GLint location,
 
struct gl_uniform_storage *const uni =
   validate_uniform_parameters(ctx, shProg, location, 1,
-  &offset, "glGetUniform", true);
-   if (uni == NULL)
+  &offset, "glGetUniform");
+   if (uni == NULL) {
+  /* For glGetUniform, page 264 (page 278 of the PDF) of the OpenGL 2.1
+   * spec says:
+   *
+   * "The error INVALID_OPERATION is generated if program has not been
+   * linked successfully, or if location is not a valid location for
+   * program."
+   *
+   * For glUniform, page 82 (page 96 of the PDF) of the OpenGL 2.1 spec
+   * says:
+   *
+   * "If the value of location is -1, the Uniform* commands will
+   * silently ignore the data passed in, and the current uniform
+   * values will not be changed."
+   *
+   * Allowing -1 for the location parameter of glUniform allows
+   * applications to avoid error paths in the case that, for example, some
+   * uniform variable is removed by the compiler / linker after
+   * optimization.  In this case, the new value of the uniform is dropped
+   * on the floor.  For the case of glGetUniform, there is nothing
+   * sensible to do for a location of -1.
+   *
+   * If the location was -1, validate_uniform_parameters will return NULL
+   * without raising an error.  Raise the error here.
+   */
+  if (location == -1) {
+ _mesa_error(ctx, GL_INVALID_OPERATION, "glGetUniform(location=%d)",
+ location);
+  }
+
   return;
+   }
 
{
   unsigned elements = (uni->type->is_sampler())
@@ -590,7 +590,7 @@ _mesa_uniform(struct gl_con

[Mesa-dev] [PATCH 06/10] mesa: Rework array error checks in validate_uniform_parameters

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Before ARB_explicit_uniform_location, Mesa's location encoding allowed
locations for non-array types that had non-zero array indices.
Basically, part of the location was the uniform and part was the array
index.  This meant that some checks had to occur for arrays and
non-arrays.  This is no longer possible, so the checks can be split up.
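
A minimal sketch of the split, assuming a hypothetical Uniform struct that
carries only the two fields involved. Non-arrays get the count check and a
fixed index of zero; arrays get the index computation and bounds check.

```cpp
// Hypothetical stand-in for the relevant part of gl_uniform_storage.
struct Uniform {
   unsigned array_elements;   // 0 means "not an array"
   int remap_location;        // base location of the uniform
};

// Returns false where the real code would raise GL_INVALID_OPERATION.
bool resolve_array_index(const Uniform &u, int location, int count,
                         unsigned *array_index)
{
   if (u.array_elements == 0) {
      // Non-array: at most one element may be written, and the index is
      // always zero.
      if (count > 1)
         return false;

      *array_index = 0;
      return true;
   }

   // Array: the index is the offset from the uniform's base location.
   const unsigned idx = static_cast<unsigned>(location - u.remap_location);
   if (idx >= u.array_elements)
      return false;

   *array_index = idx;
   return true;
}
```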

Signed-off-by: Ian Romanick 
Cc: Tapani Pälli 
---
 src/mesa/main/uniform_query.cpp | 41 ++---
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 16e08d4..b87dbdf 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -252,27 +252,30 @@ validate_uniform_parameters(struct gl_context *ctx,
 
struct gl_uniform_storage *const uni = shProg->UniformRemapTable[location];
 
-   if (uni->array_elements == 0 && count > 1) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
- "%s(count > 1 for non-array, location=%d)",
- caller, location);
-  return NULL;
-   }
+   if (uni->array_elements == 0) {
+  if (count > 1) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "%s(count > 1 for non-array, location=%d)",
+ caller, location);
+ return NULL;
+  }
 
-   /* The array index specified by the uniform location is just the uniform
-* location minus the base location of of the uniform.
-*/
-   *array_index = location - uni->remap_location;
+  assert((location - uni->remap_location) == 0);
+  *array_index = 0;
+   } else {
+  /* The array index specified by the uniform location is just the uniform
+   * location minus the base location of the uniform.
+   */
+  *array_index = location - uni->remap_location;
 
-   /* If the uniform is an array, check that array_index is in bounds.
-* If not an array, check that array_index is zero.
-* array_index is unsigned so no need to check for less than zero.
-*/
-   const unsigned limit = MAX2(uni->array_elements, 1);
-   if (*array_index >= limit) {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "%s(location=%d)",
- caller, location);
-  return NULL;
+  /* If the uniform is an array, check that array_index is in bounds.
+   * array_index is unsigned so no need to check for less than zero.
+   */
+  if (*array_index >= uni->array_elements) {
+ _mesa_error(ctx, GL_INVALID_OPERATION, "%s(location=%d)",
+ caller, location);
+ return NULL;
+  }
}
return uni;
 }
-- 
1.8.1.4



[Mesa-dev] [PATCH 09/10] glsl: Swap the order of glsl_type::name and ::length

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

On x86-64 this saves 8 bytes of padding in the structure, and this
reduces the size of the structure to 32 bytes.
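
The effect is easy to reproduce with a toy structure. The Before and After
layouts here are hypothetical and much smaller than glsl_type, so the
absolute sizes differ from the 32 bytes quoted above; only the 8-byte
saving from reordering carries over, and only on ABIs with 8-byte pointers
and 4-byte unsigned.

```cpp
#include <cstddef>
#include <cstdint>

// Pointer placed between two 4-byte members. On a typical x86-64 ABI:
// 4 + 4 (pad) + 8 + 4 + 4 (pad) + 8 bytes.
struct Before {
   std::uint32_t tag;
   const char *name;
   unsigned length;
   const void *fields;
};

// The two 4-byte members are now adjacent and share one 8-byte slot, so
// no padding is needed before either pointer.
struct After {
   std::uint32_t tag;
   unsigned length;
   const char *name;
   const void *fields;
};

constexpr std::size_t padding_saved()
{
   return sizeof(Before) - sizeof(After);
}
```

On a 32-bit target both layouts are expected to be the same size, so the
reordering would be neutral there.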

Signed-off-by: Ian Romanick 
---
 src/glsl/glsl_types.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
index 096f546..474b129 100644
--- a/src/glsl/glsl_types.h
+++ b/src/glsl/glsl_types.h
@@ -158,13 +158,6 @@ struct glsl_type {
/*@}*/
 
/**
-* Name of the data type
-*
-* Will never be \c NULL.
-*/
-   const char *name;
-
-   /**
 * For \c GLSL_TYPE_ARRAY, this is the length of the array.  For
 * \c GLSL_TYPE_STRUCT or \c GLSL_TYPE_INTERFACE, it is the number of
 * elements in the structure and the number of values pointed to by
@@ -173,6 +166,13 @@ struct glsl_type {
unsigned length;
 
/**
+* Name of the data type
+*
+* Will never be \c NULL.
+*/
+   const char *name;
+
+   /**
 * Subtype of composite data types.
 */
union {
-- 
1.8.1.4



[Mesa-dev] [PATCH 10/10] mesa: Uniform logging is very, very unlikely

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index f971ba1..32870d0 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -634,7 +634,7 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
   return;
}
 
-   if (ctx->_Shader->Flags & GLSL_UNIFORMS) {
+   if (unlikely(ctx->_Shader->Flags & GLSL_UNIFORMS)) {
   log_uniform(values, basicType, components, 1, count,
  false, shProg, location, uni);
}
@@ -846,7 +846,7 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg,
   }
}
 
-   if (ctx->_Shader->Flags & GLSL_UNIFORMS) {
+   if (unlikely(ctx->_Shader->Flags & GLSL_UNIFORMS)) {
   log_uniform(values, GLSL_TYPE_FLOAT, components, vectors, count,
  bool(transpose), shProg, location, uni);
}
-- 
1.8.1.4



[Mesa-dev] [PATCH 00/10] glUniform* micro-optimizations, part 1

2014-11-03 Thread Ian Romanick
This is the first, and more minor, batch of micro-optimizations for the
glUniform* paths.  Other than patch 8, these probably aren't going to
make much difference, even in CPU-limited applications.

The next batch, which needs a bit more time to finish baking, should
have some more substantial improvements.

 src/glsl/glsl_types.h   |  18 +--
 src/mesa/main/uniform_query.cpp | 260 +++-
 src/mesa/main/uniforms.c|  96 +++
 src/mesa/main/uniforms.h|   4 +-
 4 files changed, 157 insertions(+), 221 deletions(-)


[Mesa-dev] [PATCH 08/10] glsl: Store glsl_type::vector_elements and ::matrix_columns as uint8_t

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Due to the total number of bits used in the bitfield, this does not
increase the size of the structure.

It does, however, reduce the number of instructions required each time
one of these fields is accessed.  To access ::matrix_columns with the
bitfield, three instructions were required:

movzbl 0x9(%rdx),%eax
shr    %al
and    $0x7,%eax

As a uint8_t, only one instruction is required.

movzbl 0xa(%rdx),%eax

These fields are accessed *a lot*.
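
For illustration, here is a sketch of the two layouts side by side. The
field set is hypothetical (sampler_kind is an invented neighbour, not the
real contents of glsl_type), and whether the two structs end up the same
size depends on the ABI's bitfield rules, so the sketch makes no claim
about sizes.

```cpp
#include <cstdint>

// Bitfield layout: reading matrix_columns needs a load, a shift, and a
// mask, because the field shares storage with its neighbours.
struct PackedType {
   unsigned sampler_kind:4;
   unsigned vector_elements:3;
   unsigned matrix_columns:3;
};

// Byte layout: each field has its own addressable byte, so a single
// zero-extending load is enough.
struct ByteType {
   unsigned sampler_kind:4;
   std::uint8_t vector_elements;
   std::uint8_t matrix_columns;
};

unsigned columns(const PackedType &t) { return t.matrix_columns; }
unsigned columns(const ByteType &t)   { return t.matrix_columns; }
```

Both accessors return the same value; the difference is only in the
instructions a compiler is likely to emit for each.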

Signed-off-by: Ian Romanick 
---
 src/glsl/glsl_types.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
index 6543041..096f546 100644
--- a/src/glsl/glsl_types.h
+++ b/src/glsl/glsl_types.h
@@ -153,8 +153,8 @@ struct glsl_type {
 * these will be 0.
 */
/*@{*/
-   unsigned vector_elements:3; /**< 1, 2, 3, or 4 vector elements. */
-   unsigned matrix_columns:3;  /**< 1, 2, 3, or 4 matrix columns. */
+   uint8_t vector_elements;/**< 1, 2, 3, or 4 vector elements. */
+   uint8_t matrix_columns; /**< 1, 2, 3, or 4 matrix columns. */
/*@}*/
 
/**
-- 
1.8.1.4



[Mesa-dev] [PATCH 03/10] mesa: Minor clean ups in _mesa_uniform

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 32 +---
 1 file changed, 9 insertions(+), 23 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index aefa8b8..a6992c7 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -587,7 +587,6 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
   unsigned src_components)
 {
unsigned offset;
-   unsigned components;
 
struct gl_uniform_storage *const uni =
   validate_uniform_parameters(ctx, shProg, location, count,
@@ -597,11 +596,8 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
 
/* Verify that the types are compatible.
 */
-   if (uni->type->is_sampler()) {
-  components = 1;
-   } else {
-  components = uni->type->vector_elements;
-   }
+   const unsigned components = uni->type->is_sampler()
+  ? 1 : uni->type->vector_elements;
 
bool match;
switch (uni->type->base_type) {
@@ -645,9 +641,7 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
 * GL_INVALID_VALUE error and ignore the command.
 */
if (uni->type->is_sampler()) {
-  int i;
-
-  for (i = 0; i < count; i++) {
+  for (int i = 0; i < count; i++) {
 const unsigned texUnit = ((unsigned *) values)[i];
 
  /* check that the sampler (tex unit index) is legal */
@@ -662,9 +656,7 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
}
 
if (uni->type->is_image()) {
-  int i;
-
-  for (i = 0; i < count; i++) {
+  for (int i = 0; i < count; i++) {
  const int unit = ((GLint *) values)[i];
 
  /* check that the image unit is legal */
@@ -704,9 +696,8 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
 (const union gl_constant_value *) values;
   union gl_constant_value *dst = &uni->storage[components * offset];
   const unsigned elems = components * count;
-  unsigned i;
 
-  for (i = 0; i < elems; i++) {
+  for (unsigned i = 0; i < elems; i++) {
 if (basicType == GLSL_TYPE_FLOAT) {
 dst[i].i = src[i].f != 0.0f ? ctx->Const.UniformBooleanTrue : 0;
 } else {
@@ -723,19 +714,16 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
 * the changes through.
 */
if (uni->type->is_sampler()) {
-  int i;
-
   bool flushed = false;
-  for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  for (int i = 0; i < MESA_SHADER_STAGES; i++) {
 struct gl_shader *const sh = shProg->_LinkedShaders[i];
- int j;
 
 /* If the shader stage doesn't use the sampler uniform, skip this.
  */
 if (sh == NULL || !uni->sampler[i].active)
continue;
 
- for (j = 0; j < count; j++) {
+ for (int j = 0; j < count; j++) {
 sh->SamplerUnits[uni->sampler[i].index + offset + j] =
((unsigned *) values)[j];
  }
@@ -777,13 +765,11 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg,
 * uniforms to image units present in the shader data structure.
 */
if (uni->type->is_image()) {
-  int i, j;
-
-  for (i = 0; i < MESA_SHADER_STAGES; i++) {
+  for (int i = 0; i < MESA_SHADER_STAGES; i++) {
 if (uni->image[i].active) {
 struct gl_shader *sh = shProg->_LinkedShaders[i];
 
-for (j = 0; j < count; j++)
+for (int j = 0; j < count; j++)
sh->ImageUnits[uni->image[i].index + offset + j] =
   ((GLint *) values)[j];
  }
-- 
1.8.1.4



[Mesa-dev] [PATCH 07/10] mesa: Don't check for API_OPENGLES in _mesa_uniform_matrix

2014-11-03 Thread Ian Romanick
From: Ian Romanick 

There are no uniforms in OpenGL ES 1.x, so we can't even get to this
code in that API.

Also, reorder the checks.  First check that transpose is true, then
check whether or not that is legal in the current API.  transpose should
never be true in an ES2 context, so this gets one check (the more
expensive one) out of the main path.
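
A minimal sketch of the reordered check, assuming a heavily reduced context
(the enum, struct, and function names here are hypothetical, not Mesa's):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for Mesa's API enum and context. */
enum api { API_GL_COMPAT, API_GL_CORE, API_GLES1, API_GLES2 };

struct context {
   enum api api;
   unsigned version;   /* e.g. 20 for ES 2.0, 30 for ES 3.0 */
};

/* Returns true when a matrix upload should raise GL_INVALID_VALUE. */
bool transpose_is_invalid(const struct context *ctx, bool transpose)
{
   assert(ctx);

   /* Test the flag first: on the common path it is false and we are done. */
   if (!transpose)
      return false;

   /* Only transposed uploads reach the API/version check. */
   return ctx->api == API_GLES2 && ctx->version < 30;
}
```

With this ordering, callers that never pass transpose never pay for the API
comparison, which is the point of the reordering described above.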

Signed-off-by: Ian Romanick 
---
 src/mesa/main/uniform_query.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index b87dbdf..f971ba1 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -836,10 +836,10 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg,
}
 
/* GL_INVALID_VALUE is generated if `transpose' is not GL_FALSE.
-* http://www.khronos.org/opengles/sdk/docs/man/xhtml/glUniform.xml */
-   if (ctx->API == API_OPENGLES
-   || (ctx->API == API_OPENGLES2 && ctx->Version < 30)) {
-  if (transpose) {
+* http://www.khronos.org/opengles/sdk/docs/man/xhtml/glUniform.xml
+*/
+   if (transpose) {
+  if (ctx->API == API_OPENGLES2 && ctx->Version < 30) {
 _mesa_error(ctx, GL_INVALID_VALUE,
 "glUniformMatrix(matrix transpose is not GL_FALSE)");
 return;
-- 
1.8.1.4



Re: [Mesa-dev] [PATCH 00/10] glUniform* micro-optimizations, part 1

2014-11-03 Thread Brian Paul

On 11/03/2014 05:22 PM, Ian Romanick wrote:
> This is the first, and more minor, batch of micro-optimizations for the
> glUniform* paths.  Other than patch 8, these probably aren't going to
> make a lot of difference, even on CPU limited applications.
>
> The next batch, which needs a bit more time to finish baking, should
> have some more substantial improvements.
>
>   src/glsl/glsl_types.h           |  18 +--
>   src/mesa/main/uniform_query.cpp | 260 +++-
>   src/mesa/main/uniforms.c        |  96 +++
>   src/mesa/main/uniforms.h        |   4 +-
>   4 files changed, 157 insertions(+), 221 deletions(-)

Looks good to me.

Reviewed-by: Brian Paul 

BTW, in _mesa_uniform() we have two instances of:

   if (uni->type->is_sampler()) {
  ...
   }

   if (uni->type->is_image()) {
  ...
   }

I believe the second 'if' could be 'else if'

Probably no real savings, but it would read better.
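
A minimal sketch of the 'else if' form, assuming (as the suggestion does)
that a uniform type is never both a sampler and an image.  All names below
are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

enum handled { HANDLED_NONE, HANDLED_SAMPLER, HANDLED_IMAGE };

/* Decide which opaque-type path runs for a uniform. */
enum handled handle_opaque(bool is_sampler, bool is_image)
{
   enum handled h = HANDLED_NONE;

   /* Assumption: the two kinds are mutually exclusive. */
   assert(!(is_sampler && is_image));

   if (is_sampler) {
      h = HANDLED_SAMPLER;
   } else if (is_image) {   /* not evaluated at all for samplers */
      h = HANDLED_IMAGE;
   }
   return h;
}
```

The chained form mostly documents that the two branches cannot both run.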

-Brian



[Mesa-dev] [PATCH 15/15] draw: allow LLVM use on non-SSE2 X86 cpus

2014-11-03 Thread David Heidelberg


This patch removes the workaround for an LLVM 2.7 bug.

The original bug was closed as fixed in 2011.
Gallium currently requires LLVM 3.3 (2013).

Original LLVM bug: http://llvm.org/bugs/show_bug.cgi?id=6960

Signed-off-by: David Heidelberg 
---
 src/gallium/auxiliary/draw/draw_context.c | 15 +--
 1 file changed, 1 insertion(+), 14 deletions(-)


diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index b0f4ca2..37b6c5d 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -53,20 +53,7 @@
 boolean
 draw_get_option_use_llvm(void)
 {
-   static boolean first = TRUE;
-   static boolean value;
-   if (first) {
-  first = FALSE;
-  value = debug_get_bool_option("DRAW_USE_LLVM", TRUE);
-
-#ifdef PIPE_ARCH_X86
-  util_cpu_detect();
-  /* require SSE2 due to LLVM PR6960. XXX Might be fixed by now? */
-  if (!util_cpu_caps.has_sse2)
- value = FALSE;
-#endif
-   }
-   return value;
+   return debug_get_bool_option("DRAW_USE_LLVM", TRUE);
 }
 #else
 boolean



[Mesa-dev] [PATCH 1/1] r600, llvm: Fix mem leak

2014-11-03 Thread Jan Vesely
Signed-off-by: Jan Vesely 
---
 src/gallium/drivers/r600/r600_llvm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c
index c19693a..5f74bf7 100644
--- a/src/gallium/drivers/r600/r600_llvm.c
+++ b/src/gallium/drivers/r600/r600_llvm.c
@@ -888,6 +888,7 @@ unsigned r600_llvm_compile(
 
FREE(binary.code);
FREE(binary.config);
+   FREE(binary.rodata);
 
return r;
 }
-- 
1.9.3



[Mesa-dev] [PATCH 1/1] r600: upload implicit arguments even if there are no explicit args

2014-11-03 Thread Jan Vesely
Signed-off-by: Jan Vesely 
---

Moreover, the condition is never true now that clover appends dim info.

 src/gallium/drivers/r600/evergreen_compute.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index 90fdd79..41dc93e 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -295,10 +295,6 @@ void evergreen_compute_upload_input(
struct pipe_box box;
struct pipe_transfer *transfer = NULL;
 
-   if (shader->input_size == 0) {
-   return;
-   }
-
if (!shader->kernel_param) {
/* Add space for the grid dimensions */
shader->kernel_param = (struct r600_resource *)
-- 
1.9.3



[Mesa-dev] [PATCH] mesa: Don't call _mesa_ClipControl from glPopAttrib when unsupported.

2014-11-03 Thread Kenneth Graunke
Otherwise, calling glPopAttrib on drivers that don't support
ARB_clip_control gives you a GL error, which is surprising at best.
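
A minimal sketch of the guard, assuming a heavily reduced context.  The
struct, the error code, and both functions are hypothetical stand-ins for
the real Mesa state and entry points:

```c
#include <assert.h>
#include <stdbool.h>

#define ERR_NONE              0
#define ERR_INVALID_OPERATION 1

/* Hypothetical, heavily reduced context. */
struct context {
   bool has_clip_control;   /* stand-in for Extensions.ARB_clip_control */
   int clip_origin;
   int last_error;
};

/* Stand-in for an entry point that errors when the extension is missing. */
void set_clip_control(struct context *ctx, int origin)
{
   assert(ctx);
   if (!ctx->has_clip_control) {
      ctx->last_error = ERR_INVALID_OPERATION;
      return;
   }
   ctx->clip_origin = origin;
}

/* Restoring saved state: only touch clip control when it is supported. */
void restore_transform(struct context *ctx, int saved_origin)
{
   assert(ctx);
   if (ctx->has_clip_control)
      set_clip_control(ctx, saved_origin);
}
```

Without the check in restore_transform(), a driver lacking the extension
would record an error on every restore even though the application never
used clip control.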

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/attrib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
index 5345339..4684615 100644
--- a/src/mesa/main/attrib.c
+++ b/src/mesa/main/attrib.c
@@ -1345,7 +1345,8 @@ _mesa_PopAttrib(void)
if (xform->DepthClamp != ctx->Transform.DepthClamp)
   _mesa_set_enable(ctx, GL_DEPTH_CLAMP,
ctx->Transform.DepthClamp);
-   _mesa_ClipControl(xform->ClipOrigin, xform->ClipDepthMode);
+   if (ctx->Extensions.ARB_clip_control)
+  _mesa_ClipControl(xform->ClipOrigin, xform->ClipDepthMode);
 }
 break;
  case GL_TEXTURE_BIT:
-- 
2.1.2



[Mesa-dev] [PATCH 2/4] i965: Use ctx->Const.MaxLineWidth when clamping ctx->Line.Width.

2014-11-03 Thread Kenneth Graunke
Rather than hardcoding platform values in every code path, just use the
maximum value we set.

Currently, ctx->Const.MaxLineWidth == 5, which is smaller than the hardware
limit.  But applications shouldn't be using a value larger than we
support anyway.
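
A minimal sketch of clamping against an advertised maximum before converting
to fixed point.  The helper names are hypothetical; they only approximate
what CLAMP and U_FIXED do in the diffs below:

```c
#include <assert.h>
#include <stdint.h>

/* Clamp x into [lo, hi]. */
static float clamp_f(float x, float lo, float hi)
{
   return x < lo ? lo : (x > hi ? hi : x);
}

/* Convert a non-negative float to unsigned fixed point, truncating. */
static uint32_t to_ufixed(float value, unsigned frac_bits)
{
   assert(value >= 0.0f);
   return (uint32_t)(value * (float)(1u << frac_bits));
}

/* Line width as a U3.7 value, limited by the advertised maximum. */
uint32_t line_width_u3_7(float requested, float max_line_width)
{
   uint32_t w = to_ufixed(clamp_f(requested, 0.0f, max_line_width), 7);

   /* The Gen6+ paths in this patch bump a zero width up to 1. */
   return w == 0 ? 1 : w;
}
```

Passing the maximum in as a parameter keeps the limit in one place, so
raising the advertised value later does not require touching the clamp.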

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_sf_state.c  | 4 ++--
 src/mesa/drivers/dri/i965/gen6_sf_state.c | 3 ++-
 src/mesa/drivers/dri/i965/gen7_sf_state.c | 3 ++-
 src/mesa/drivers/dri/i965/gen8_sf_state.c | 3 ++-
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c b/src/mesa/drivers/dri/i965/brw_sf_state.c
index 7f31bc1..2fecd1e 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
@@ -213,8 +213,8 @@ static void upload_sf_unit( struct brw_context *brw )
}
 
/* _NEW_LINE */
-   /* XXX use ctx->Const.Min/MaxLineWidth here */
-   sf->sf6.line_width = CLAMP(ctx->Line.Width, 1.0, 5.0) * (1<<1);
+   sf->sf6.line_width =
+  CLAMP(ctx->Line.Width, 1.0, ctx->Const.MaxLineWidth) * (1<<1);
 
sf->sf6.line_endcap_aa_region_width = 1;
if (ctx->Line.SmoothFlag)
diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index d0411b0..24d2754 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -322,7 +322,8 @@ upload_sf_state(struct brw_context *brw)
 
/* _NEW_LINE */
{
-  uint32_t line_width_u3_7 = U_FIXED(CLAMP(ctx->Line.Width, 0.0, 7.99), 7);
+  uint32_t line_width_u3_7 =
+ U_FIXED(CLAMP(ctx->Line.Width, 0.0, ctx->Const.MaxLineWidth), 7);
   /* TODO: line width of 0 is not allowed when MSAA enabled */
   if (line_width_u3_7 == 0)
  line_width_u3_7 = 1;
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index 150a4d3..109b825 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -190,7 +190,8 @@ upload_sf_state(struct brw_context *brw)
 
/* _NEW_LINE */
{
-  uint32_t line_width_u3_7 = U_FIXED(CLAMP(ctx->Line.Width, 0.0, 7.99), 7);
+  uint32_t line_width_u3_7 =
+ U_FIXED(CLAMP(ctx->Line.Width, 0.0, ctx->Const.MaxLineWidth), 7);
   /* TODO: line width of 0 is not allowed when MSAA enabled */
   if (line_width_u3_7 == 0)
  line_width_u3_7 = 1;
diff --git a/src/mesa/drivers/dri/i965/gen8_sf_state.c b/src/mesa/drivers/dri/i965/gen8_sf_state.c
index 6aa7b4d..6995a6a 100644
--- a/src/mesa/drivers/dri/i965/gen8_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_sf_state.c
@@ -149,7 +149,8 @@ upload_sf(struct brw_context *brw)
dw1 |= GEN6_SF_VIEWPORT_TRANSFORM_ENABLE;
 
/* _NEW_LINE */
-   uint32_t line_width_u3_7 = U_FIXED(CLAMP(ctx->Line.Width, 0.0, 7.99), 7);
+   uint32_t line_width_u3_7 =
+  U_FIXED(CLAMP(ctx->Line.Width, 0.0, ctx->Const.MaxLineWidth), 7);
if (line_width_u3_7 == 0)
   line_width_u3_7 = 1;
if (brw->gen >= 9 || brw->is_cherryview) {
-- 
2.1.2



[Mesa-dev] [PATCH 4/4] i965: Advertise a line width of 40.0 on Cherryview and Skylake.

2014-11-03 Thread Kenneth Graunke
According to the documentation, line widths higher than 40.0 may have
quality problems.  That's already 8 times larger than the 5.0 we had been
exposing, so it seems totally sufficient.
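
A minimal sketch of the per-generation selection, using the values from this
series.  The struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the advertised line width limits. */
struct line_limits {
   float max_width;
   float granularity;
};

/* Pick limits by hardware generation. */
struct line_limits pick_line_limits(int gen, bool is_cherryview)
{
   assert(gen > 0);

   /* Cherryview is Gen8, so it must be tested before the gen >= 6 case. */
   if (gen >= 9 || is_cherryview)
      return (struct line_limits){ 40.0f, 0.125f };
   else if (gen >= 6)
      return (struct line_limits){ 7.875f, 0.125f };
   else
      return (struct line_limits){ 7.0f, 0.5f };
}
```

The order of the tests matters: checking gen >= 6 first would silently give
Cherryview the smaller limit.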

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index eaabd43..8b0f391 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -421,7 +421,11 @@ brw_initialize_context_constants(struct brw_context *brw)
 
ctx->Const.MinLineWidth = 1.0;
ctx->Const.MinLineWidthAA = 1.0;
-   if (brw->gen >= 6) {
+   if (brw->gen >= 9 || brw->is_cherryview) {
+  ctx->Const.MaxLineWidth = 40.0;
+  ctx->Const.MaxLineWidthAA = 40.0;
+  ctx->Const.LineWidthGranularity = 0.125;
+   } else if (brw->gen >= 6) {
   ctx->Const.MaxLineWidth = 7.875;
   ctx->Const.MaxLineWidthAA = 7.875;
   ctx->Const.LineWidthGranularity = 0.125;
-- 
2.1.2



[Mesa-dev] [PATCH 1/4] i965: Set Line Width correctly on Cherryview and Skylake.

2014-11-03 Thread Kenneth Graunke
Line Width moved to DW1 bits 29:12.  It's actually now a U11.7.
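
Bits 29:12 span 18 bits, which matches 11 integer plus 7 fractional bits.  A
minimal sketch of placing such a value into a dword follows.  The macro and
function names are hypothetical, and the masking step is an extra that the
patch itself does not do (it only ORs in the shifted value):

```c
#include <assert.h>
#include <stdint.h>

#define LINE_WIDTH_SHIFT 12   /* same value as GEN9_SF_LINE_WIDTH_SHIFT */
#define LINE_WIDTH_BITS  18   /* 11 integer + 7 fractional bits */

/* Replace bits 29:12 of dw1 with a U11.7 line width. */
uint32_t pack_line_width(uint32_t dw1, uint32_t width_u11_7)
{
   const uint32_t mask = ((1u << LINE_WIDTH_BITS) - 1) << LINE_WIDTH_SHIFT;

   /* The value must fit in the 18-bit field. */
   assert(width_u11_7 < (1u << LINE_WIDTH_BITS));

   return (dw1 & ~mask) | (width_u11_7 << LINE_WIDTH_SHIFT);
}
```

A width of 1.0 is 128 in U11.7, so it lands at 128 << 12 in the dword.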

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_defines.h   | 1 +
 src/mesa/drivers/dri/i965/gen8_sf_state.c | 6 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index 64ff744..37666b1 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -1800,6 +1800,7 @@ enum brw_message_target {
 # define GEN6_SF_SWIZZLE_ENABLE(1 << 21)
 # define GEN6_SF_POINT_SPRITE_UPPERLEFT(0 << 20)
 # define GEN6_SF_POINT_SPRITE_LOWERLEFT(1 << 20)
+# define GEN9_SF_LINE_WIDTH_SHIFT  12 /* U11.7 */
 # define GEN6_SF_URB_ENTRY_READ_LENGTH_SHIFT   11
 # define GEN6_SF_URB_ENTRY_READ_OFFSET_SHIFT   4
 /* DW2 */
diff --git a/src/mesa/drivers/dri/i965/gen8_sf_state.c b/src/mesa/drivers/dri/i965/gen8_sf_state.c
index 1d7b932..6aa7b4d 100644
--- a/src/mesa/drivers/dri/i965/gen8_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_sf_state.c
@@ -152,7 +152,11 @@ upload_sf(struct brw_context *brw)
uint32_t line_width_u3_7 = U_FIXED(CLAMP(ctx->Line.Width, 0.0, 7.99), 7);
if (line_width_u3_7 == 0)
   line_width_u3_7 = 1;
-   dw2 |= line_width_u3_7 << GEN6_SF_LINE_WIDTH_SHIFT;
+   if (brw->gen >= 9 || brw->is_cherryview) {
+  dw1 |= line_width_u3_7 << GEN9_SF_LINE_WIDTH_SHIFT;
+   } else {
+  dw2 |= line_width_u3_7 << GEN6_SF_LINE_WIDTH_SHIFT;
+   }
 
if (ctx->Line.SmoothFlag) {
   dw2 |= GEN6_SF_LINE_END_CAP_WIDTH_1_0;
-- 
2.1.2



[Mesa-dev] [PATCH 3/4] i965: Advertise larger line widths.

2014-11-03 Thread Kenneth Graunke
We've artificially been limiting this to 5 for no particular reason.

On Gen4-5, the limit is [0, 7.5] with a granularity of 0.5 (U3.1).
On Gen6+, the limit is [0, 7.9921875].  Since it's a U3.7, the
granularity should be 0.125 (1/8).

This patch conservatively advertises one granularity smaller than the
hardware's maximum value, just in case there's a problem using the
largest possible value.  On Gen4-5, this is 7.5 - 0.5 = 7.0.  On Gen6+,
this is 8.0 - 0.125 = 7.875.
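
The format limits quoted above can be checked with a small sketch.  The
helper names are hypothetical, and this only computes what an unsigned
fixed-point format can represent.  Note that the step of a U3.7 format is
1/128, while the advertised granularity in this patch is 0.125; the sketch
makes no claim about what the hardware documentation specifies there.

```c
#include <assert.h>

/* Largest value representable in an unsigned int_bits.frac_bits format. */
double ufixed_max(unsigned int_bits, unsigned frac_bits)
{
   unsigned total = int_bits + frac_bits;

   assert(total < 32);
   return (double)((1u << total) - 1) / (double)(1u << frac_bits);
}

/* Smallest increment representable with frac_bits fractional bits. */
double ufixed_step(unsigned frac_bits)
{
   assert(frac_bits < 32);
   return 1.0 / (double)(1u << frac_bits);
}
```

For U3.1 this gives a maximum of 7.5 with a step of 0.5, and for U3.7 a
maximum of 7.9921875, matching the ranges in the commit message.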

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index e1a994a..eaabd43 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -421,9 +421,15 @@ brw_initialize_context_constants(struct brw_context *brw)
 
ctx->Const.MinLineWidth = 1.0;
ctx->Const.MinLineWidthAA = 1.0;
-   ctx->Const.MaxLineWidth = 5.0;
-   ctx->Const.MaxLineWidthAA = 5.0;
-   ctx->Const.LineWidthGranularity = 0.5;
+   if (brw->gen >= 6) {
+  ctx->Const.MaxLineWidth = 7.875;
+  ctx->Const.MaxLineWidthAA = 7.875;
+  ctx->Const.LineWidthGranularity = 0.125;
+   } else {
+  ctx->Const.MaxLineWidth = 7.0;
+  ctx->Const.MaxLineWidthAA = 7.0;
+  ctx->Const.LineWidthGranularity = 0.5;
+   }
 
ctx->Const.MinPointSize = 1.0;
ctx->Const.MinPointSizeAA = 1.0;
-- 
2.1.2



[Mesa-dev] [PATCH 0/4] i965: Totally legit line width patches

2014-11-03 Thread Kenneth Graunke
Here are some totally legit line width patches.  I noticed that Cherryview
was setting line width in DW2 of 3DSTATE_SF, when it actually moved to DW1
at a different bit location.  While fixing that, I figured I should update
the clamp value to reflect the new hardware limit...which led me to want
to use ctx->Const.MaxLineWidth as the clamp value.  (There's actually a
TODO comment about this in the original Gen4 code.)  This then led me to
advertise a larger value for line widths.

No Piglit regressions on Gen4-7.5.  Which sounds comforting, except I think
we actually fail most of our line rendering tests anyway, so they wouldn't
have "regressed".  Plus, Piglit doesn't seem to test large line widths
in the first place.

I ran some oglconform tests which at least drew things with larger line
widths, but I'm not sure how to view the output, and they didn't appear
to actually probe values and fail the test when wrong.

Also untested on Broadwell, Cherryview, or Skylake, which is kind of the
point of the series.  This is high quality stuff, folks!

Suggestions of "please write Piglit tests" will be tacitly ignored/filed
away in the "when I get to it (but I'll probably forget first)" file. :)



[Mesa-dev] [Bug 54080] glXQueryDrawable fails with GLXBadDrawable for a Window in direct context

2014-11-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=54080

--- Comment #8 from Adam Nielsen  ---
FYI the latest Oculus Rift SDK release hits this bug now.  This means under
Linux, the Rift can only be used with alternatives like the nVidia
closed-source driver.

https://developer.oculusvr.com/forums/viewtopic.php?f=34&t=16664

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [Mesa-announce] Mesa 10.3 release candidate 1

2014-11-03 Thread Ausmus, James
On Thu, Oct 23, 2014 at 11:35 AM, Matt Turner  wrote:
>
> On Sun, Aug 24, 2014 at 11:51 PM, Thierry Vignaud
>  wrote:
> > On 21 August 2014 17:54, Carl Worth  wrote:
> >> I have verified building from the .tar.bz2 file by doing the following
> >> on a Debian (unstable) system:
> >>
> >> tar xjf MesaLib-10.3.0-rc1.tar.bz2
> >> cd Mesa-10.3.0-rc1
> >> ./configure --enable-gallium-llvm
> >> make -j6
> >> make install
> >
> > Unlike previous releases, it builds smoothly with -j4 but fails with -j24:
> >
> > gmake[4]: Leaving directory
> > '/home/iurt/rpmbuild/BUILD/Mesa-10.3.0-rc1/build-osmesa/src/mapi/glapi/gen'
> > gmake[4]: Entering directory
> > '/home/iurt/rpmbuild/BUILD/Mesa-10.3.0-rc1/build-osmesa/src/mapi/glapi/gen'
> >   GEN  ../../../../src/glx/indirect_size.h
> > gmake[4]: Leaving directory
> > '/home/iurt/rpmbuild/BUILD/Mesa-10.3.0-rc1/build-osmesa/src/mapi/glapi/gen'
> > gmake  all-am
> > gmake[5]: Nothing to be done for 'all-am'.
> > Making all in .
> > gmake[4]: Entering directory
> > '/home/iurt/rpmbuild/BUILD/Mesa-10.3.0-rc1/build-osmesa/src/mapi'
> >   GEN  .libs/install-mesa-links
> > touch: cannot touch '.libs/install-mesa-links': No such file or directory
> > Makefile:2109: recipe for target '.libs/install-mesa-links' failed
> > gmake[4]: *** [.libs/install-mesa-links] Error 1
> >
> > You're missing deps in the make rules, which show up when using quite a
> > lot of cores
>
> I tried to reproduce this today and couldn't.


I am able to reproduce this consistently with -j40 - it bisects to:

commit c3ce1a942f90843ba637e558e990275bc742571c
Author: Matt Turner 
Date:   Thu Aug 14 12:20:12 2014 -0700

mapi: Inline shared-glapi/Makefile.


But that doesn't revert cleanly on master or the 10.3 branch








--


James Ausmus
Sr. Software Engineer
SSG-OTC ChromeOS Integration


Re: [Mesa-dev] [Mesa-announce] Mesa 10.3 release candidate 1

2014-11-03 Thread Matt Turner
On Mon, Nov 3, 2014 at 7:35 PM, Ausmus, James  wrote:
> I am able to reproduce this consistently with -j40 - it bisects to:

Thanks. Maybe you could give a little more information, like an error
message or something?


Re: [Mesa-dev] [PATCH] mesa: Don't call _mesa_ClipControl from glPopAttrib when unsupported.

2014-11-03 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Mon, Nov 3, 2014 at 6:18 PM, Kenneth Graunke wrote:

> Otherwise, calling glPopAttrib on drivers that don't support
> ARB_clip_control gives you a GL error, which is surprising at best.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/attrib.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c
> index 5345339..4684615 100644
> --- a/src/mesa/main/attrib.c
> +++ b/src/mesa/main/attrib.c
> @@ -1345,7 +1345,8 @@ _mesa_PopAttrib(void)
> if (xform->DepthClamp != ctx->Transform.DepthClamp)
>_mesa_set_enable(ctx, GL_DEPTH_CLAMP,
> ctx->Transform.DepthClamp);
> -   _mesa_ClipControl(xform->ClipOrigin, xform->ClipDepthMode);
> +   if (ctx->Extensions.ARB_clip_control)
> +  _mesa_ClipControl(xform->ClipOrigin, xform->ClipDepthMode);
>  }
>  break;
>   case GL_TEXTURE_BIT:
> --
> 2.1.2
>


Re: [Mesa-dev] [PATCH 1/3] i965: Add #defines for Broadwell HiZ workarounds in CACHE_MODE_1.

2014-11-03 Thread Kristian Høgsberg
On Wed, Oct 22, 2014 at 8:58 AM, Kenneth Graunke  wrote:
> This patch adds macros needed for the HiZ PMA stall optimization.
>
> Signed-off-by: Kenneth Graunke 

Reviewed-by: Kristian Høgsberg 

> ---
>  src/mesa/drivers/dri/i965/intel_reg.h | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h b/src/mesa/drivers/dri/i965/intel_reg.h
> index 45b82ad..5ac0180 100644
> --- a/src/mesa/drivers/dri/i965/intel_reg.h
> +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> @@ -138,3 +138,9 @@
>  #define GEN7_3DPRIM_INSTANCE_COUNT  0x2438
>  #define GEN7_3DPRIM_START_INSTANCE  0x243C
>  #define GEN7_3DPRIM_BASE_VERTEX 0x2440
> +
> +#define GEN7_CACHE_MODE_1   0x7004
> +# define GEN8_HIZ_NP_PMA_FIX_ENABLE(1 << 11)
> +# define GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE (1 << 13)
> +# define GEN8_HIZ_PMA_MASK_BITS \
> +   ((GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) << 16)
> --
> 2.1.2
>


Re: [Mesa-dev] [PATCH 2/3] i965: Implement the PMA stall fix.

2014-11-03 Thread Kristian Høgsberg
On Wed, Oct 22, 2014 at 8:58 AM, Kenneth Graunke  wrote:
> Certain non-promoted depth cases typically incur stalls.  In very
> specific cases, we can enable a workaround which improves performance.
>
> Improves performance in GLBenchmark 2.7 TRex by 1.17762% +/- 0.448765%
> (n=75) at 1280x720 on Broadwell GT3.
>
> Haswell has this feature as well, but we can't currently write registers
> from userspace batches (and we'd incur additional software batch
> scanning overhead as well), so we haven't enabled it.  Broadwell allows
> us to write CACHE_MODE_1.  Backporters beware: the formula and flushing
> incantation differs between Haswell and Broadwell.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  |   1 +
>  src/mesa/drivers/dri/i965/brw_state.h|   1 +
>  src/mesa/drivers/dri/i965/brw_state_upload.c |   6 +
>  src/mesa/drivers/dri/i965/gen8_depth_state.c | 170 +++
>  4 files changed, 178 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> index 45d72d2..7877aa1 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1079,6 +1079,7 @@ struct brw_context
> GLuint NewGLState;
> struct {
>struct brw_state_flags dirty;
> +  uint32_t pma_stall_bits;

I don't think I'd put it here; this looks like it's part of the atom
tracking.  Maybe put it in depthstencil, or just not in a sub-struct?
Anyway, nothing to hold up the patches for.

> } state;
>
> struct brw_cache cache;
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
> index 2efe56e..209fab1 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -137,6 +137,7 @@ extern const struct brw_tracked_state gen8_disable_stages;
>  extern const struct brw_tracked_state gen8_gs_state;
>  extern const struct brw_tracked_state gen8_index_buffer;
>  extern const struct brw_tracked_state gen8_multisample_state;
> +extern const struct brw_tracked_state gen8_pma_fix;
>  extern const struct brw_tracked_state gen8_ps_blend;
>  extern const struct brw_tracked_state gen8_ps_extra;
>  extern const struct brw_tracked_state gen8_ps_state;
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index a691319..efa870c 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -333,6 +333,7 @@ static const struct brw_tracked_state *gen8_atoms[] =
> &gen8_vertices,
>
> &haswell_cut_index,
> +   &gen8_pma_fix,
>  };
>
>  static void
> @@ -390,6 +391,11 @@ void brw_init_state( struct brw_context *brw )
> brw->state.dirty.mesa = ~0;
> brw->state.dirty.brw = ~0ull;
>
> +   /* ~0 is a nonsensical value which won't match anything we program, so
> +* the programming will take effect on the first time around.
> +*/
> +   brw->state.pma_stall_bits = ~0;
> +
> /* Make sure that brw->state.dirty.brw has enough bits to hold all possible
>  * dirty flags.
>  */
> diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> index 7c3bfe0..4284a62 100644
> --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> @@ -28,6 +28,7 @@
>  #include "brw_context.h"
>  #include "brw_state.h"
>  #include "brw_defines.h"
> +#include "brw_wm.h"
>
>  /**
>   * Helper function to emit depth related command packets.
> @@ -210,6 +211,172 @@ gen8_emit_depth_stencil_hiz(struct brw_context *brw,
>  }
>
>  /**
> + * Should we set the PMA FIX ENABLE bit?
> + *
> + * To avoid unnecessary depth related stalls, we need to set this bit.
> + * However, there is a very complicated formula which governs when it
> + * is legal to do so.  This function computes that.
> + *
> + * See the documentation for the CACHE_MODE_1 register, bit 11.
> + */
> +static bool
> +pma_fix_enable(const struct brw_context *brw)
> +{
> +   const struct gl_context *ctx = &brw->ctx;
> +   /* BRW_NEW_FRAGMENT_PROGRAM */
> +   const struct gl_fragment_program *fp = brw->fragment_program;
> +   /* _NEW_BUFFERS */
> +   struct intel_renderbuffer *depth_irb =
> +  intel_get_renderbuffer(ctx->DrawBuffer, BUFFER_DEPTH);
> +
> +   /* 3DSTATE_WM::ForceThreadDispatch is never used. */
> +   const bool wm_force_thread_dispatch = false;
> +
> +   /* 3DSTATE_RASTER::ForceSampleCount is never used. */
> +   const bool raster_force_sample_count_nonzero = false;
> +
> +   /* _NEW_BUFFERS:
> +* 3DSTATE_DEPTH_BUFFER::SURFACE_TYPE != NULL &&
> +* 3DSTATE_DEPTH_BUFFER::HIZ Enable
> +*/
> +   const bool hiz_enabled = depth_irb && intel_renderbuffer_has_hiz(depth_irb);
> +
> +   /* 3DSTATE_WM::Early Depth/Stencil Control != EDSC_PREPS (2).
> +* We always leave this set to EDSC_NORMAL (0).
> +*/
> + 

Re: [Mesa-dev] [PATCH 3/3] i965: Re-enable Z16 on Gen8+.

2014-11-03 Thread Kristian Høgsberg
On Wed, Oct 22, 2014 at 8:58 AM, Kenneth Graunke  wrote:
> Improves performance in GLBenchmark 2.7 TRex by 3.9% +/- 0.336383%
> (n=80) at 1280x720 on Broadwell GT3.  Together with the previous patch,
> it improves performance by 5.42738% +/- 0.541971% (n=10) at 1920x1080.
>
> Note that without the PMA stall fix, this would instead decrease
> performance by 22%.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_surface_formats.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> index 5407ef6..ce6f88c 100644
> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
> @@ -619,6 +619,8 @@ brw_init_surface_formats(struct brw_context *brw)
> brw->format_supported_as_render_target[MESA_FORMAT_S_UINT8] = true;
> brw->format_supported_as_render_target[MESA_FORMAT_Z_FLOAT32] = true;
> brw->format_supported_as_render_target[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = true;
> +   if (brw->gen >= 8)
> +  brw->format_supported_as_render_target[MESA_FORMAT_Z_UNORM16] = true;
>
> /* We remap depth formats to a supported texturing format in
>  * translate_tex_format().
> @@ -639,6 +641,8 @@ brw_init_surface_formats(struct brw_context *brw)
>  * Other speculation is that we may be hitting increased fragment shader
>  * execution from GL_LEQUAL/GL_EQUAL depth tests at reduced precision.
>  */
> +   if (brw->gen >= 8)
> +  ctx->TextureFormatSupported[MESA_FORMAT_Z_UNORM16] = true;

Edit the comment above to mention that we get the expected performance
with the pma stall workaround?

Either way,

Reviewed-by: Kristian Høgsberg 

> /* On hardware that lacks support for ETC1, we map ETC1 to RGBX
>  * during glCompressedTexImage2D(). See intel_mipmap_tree::wraps_etc1.
> --
> 2.1.2
>