Ben Widawsky writes:
>> + case GL_LUMINANCE:
>> + case GL_LUMINANCE_ALPHA:
>> + override_color.ui[1] = override_color.ui[0];
>> + override_color.ui[2] = override_color.ui[0];
>> + break;
>
> The definition for GL_LUMINANCE afaict: "Each element is a single
> luminance value. Th
From: Timothy Arceri
We now also only apply these rules to variables rather than also
trying to apply them to function params.
V2: move code for handling stream layout qualifier
---
src/glsl/ast_to_hir.cpp | 414 +---
1 file changed, 212 insertions(+)
This series adds support for compile time constants and also adds
subroutine index qualifier support which was missing for
ARB_explicit_uniform_location.
This series applies on top of a clean-up series[3]
V3:
- Some refactoring and a bug fix based on Emil's feedback on V2.
- Series overhauled t
From: Timothy Arceri
Use new helper that will in a later patch allow for
compile time constants.
---
src/glsl/ast_to_hir.cpp | 23 ---
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 773b8ae..e7e2a85 1006
From: Timothy Arceri
This change moves the binding layout handling code into an apply
function to be consistent with other helper functions in the ast
code, and to encapsulate the code so that when we introduce
compile time constants the code will be much cleaner.
One small downside is for unname
From: Timothy Arceri
We are moving this out of the parser in preparation for compile
time constant support.
---
src/glsl/ast_to_hir.cpp | 22 ++
src/glsl/glsl_parser.yy | 8 +---
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/src/glsl/ast_to_hir.cpp b/s
From: Timothy Arceri
Change name from validate -> apply to more accurately describe what
the function does.
---
src/glsl/ast_to_hir.cpp | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 7a05176..06ba97c 100644
-
From: Timothy Arceri
For now this just validates that a qualifier is inside its
minimum boundary; in a later patch we will expand it to
evaluate compile time constants.
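The minimum-boundary check described above could look roughly like the sketch below. The helper name, its signature and the error text are hypothetical and are not taken from the patch; the sketch only illustrates rejecting a qualifier value that falls below its minimum.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not the name used in the patch): returns true when
 * `value` is at or above `min_value`. On failure it writes a message into
 * `err` and returns false. */
static bool
qualifier_meets_minimum(const char *qual_name, int value, int min_value,
                        char *err, size_t err_len)
{
    if (value < min_value) {
        snprintf(err, err_len, "%s must be >= %d (got %d)",
                 qual_name, min_value, value);
        return false;
    }

    /* Clear the buffer so callers never see a stale message. */
    if (err_len > 0)
        err[0] = '\0';
    return true;
}
```

A caller would pass the qualifier name for the error message, e.g. a negative stream value would be rejected with a minimum of 0.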
---
src/glsl/ast_to_hir.cpp | 17 +
1 file changed, 17 insertions(+)
diff --git a/src/glsl/ast_to_hir.cpp b/sr
From: Timothy Arceri
Use new helper that will in a later patch allow for
compile time constants.
---
src/glsl/ast_to_hir.cpp | 9 ++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index de13589..8705f6e 100644
--- a/src/glsl
From: Timothy Arceri
The minimum value for index is validated in the ast code and
we want to remove validation from the parser so we can add
compile time constant support.
---
src/glsl/glsl_parser.yy | 8 +---
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/src/glsl/glsl_parser.
From: Timothy Arceri
This patch replaces the old integer constant qualifiers with either
the new ast_layout_expression type if the qualifier requires merging
or ast_expression if the qualifier can't have multiple declarations
or if all but the newest qualifier are simply ignored.
We also update
From: Timothy Arceri
---
docs/GL3.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/GL3.txt b/docs/GL3.txt
index b768eea..ad6b95e 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -179,7 +179,7 @@ GL 4.4, GLSL 4.40:
GL_ARB_buffer_storage
From: Timothy Arceri
This validation is moved later so we can validate the
max value when compile time constant support is added in a
later patch.
---
src/glsl/ast_to_hir.cpp | 22 --
src/glsl/ast_type.cpp | 14 --
2 files changed, 20 insertions(+), 16 deletions
From: Timothy Arceri
We are moving this out of the parser in preparation for compile
time constant support.
The reason a validation function is used rather than an apply
function like what is used with bindings is because glsl allows
streams to be defined on members of blocks even though they mu
From: Timothy Arceri
ARB_explicit_uniform_location allows the index for subroutine functions
to be explicitly set in the shader.
This patch reduces the restriction on the index qualifier in
validate_layout_qualifiers() to allow it to be applied to subroutines
and adds the new subroutine qualifie
From: Timothy Arceri
This will allow us to add error checking to this function
in a later patch; if we don't move it, the error messages
will go missing.
---
src/glsl/glsl_parser_extras.cpp | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/glsl/glsl_parser_extras.cpp b
From: Timothy Arceri
In this patch we introduce a new ast type for holding the new
compile-time constant expressions. The main reason for this is that
we can no longer do merging of layout qualifiers before they have been
converted into GLSL IR so we need to store them to be processed later.
Th
On Sun, 2015-11-15 at 00:42 +1100, Timothy Arceri wrote:
> From: Timothy Arceri
>
> This validation is moved later so we can validate the
> max value when compile time constant support is added in a
> later patch.
> ---
> src/glsl/ast_to_hir.cpp | 22 --
> src/glsl/ast_type.c
Hi Emil,
I checked with Chih-Wei and LOCAL_EXPORT_C_INCLUDE_DIRS is the preferred
way.
I took the chance to finally understand why the single-line change you
proposed suffices:
it's because the glsl_compiler module uses libmesa_glsl as a static library,
and the same applies to the i965_dri module.
On Thu, Nov 12, 2015 at 7:30 AM, Iago Toral wrote:
> On Thu, 2015-11-12 at 16:23 +0100, Iago Toral wrote:
>> Patches 1-4 are,
>> Reviewed-by: Iago Toral Quiroga
>>
>> Patch 5 seems to be missing.
If it helps to calm reviewers' minds, I ran patches 1-5 with this patch on top:
http://cgit.freedes
I think it would be better if we kept track of the type of the
constant instead. That would also allow us to simplify the constant
construction code in, err, something else...
On Wed, Nov 11, 2015 at 8:23 PM, Jason Ekstrand wrote:
> From: Rob Clark
>
> This will simplify things somewhat in clone
On Sat, Nov 14, 2015 at 8:19 AM, Connor Abbott wrote:
> I think it would be better if we kept track of the type of the
> constant instead. That would also allow us to simplify the constant
> construction code in, err, something else...
We do keep track of the type. It's in the variable. Do you
well, clone just needs to know the number of elements, so this is the
simplest possible solution.. not against tracking the type as well, if
that is needed elsewhere.. or if there was a helper to map type to #
of elements, I suppose, but for now this makes clone possible.
BR,
-R
On Sat, Nov 14, 2
On Sat, Nov 14, 2015 at 11:01 AM, Jason Ekstrand wrote:
> On Thu, Nov 12, 2015 at 7:30 AM, Iago Toral wrote:
>> On Thu, 2015-11-12 at 16:23 +0100, Iago Toral wrote:
>>> Patches 1-4 are,
>>> Reviewed-by: Iago Toral Quiroga
>>>
>>> Patch 5 seems to be missing.
>
> If it helps to calm reviewer's mi
On Sat, Nov 14, 2015 at 11:55 AM, Jason Ekstrand wrote:
> On Sat, Nov 14, 2015 at 8:19 AM, Connor Abbott wrote:
>> I think it would be better if we kept track of the type of the
>> constant instead. That would also allow us to simplify the constant
>> construction code in, err, something else...
On Sat, Nov 14, 2015 at 11:55 AM, Rob Clark wrote:
> well, clone just needs to know the number of elements, so this is the
> simplest possible solution.. not against tracking the type as well, if
> that is needed elsewhere.. or if there was a helper to map type to #
> of elements, I suppose, but f
On Sat, Nov 14, 2015 at 8:58 AM, Rob Clark wrote:
> On Sat, Nov 14, 2015 at 11:01 AM, Jason Ekstrand wrote:
>> On Thu, Nov 12, 2015 at 7:30 AM, Iago Toral wrote:
>>> On Thu, 2015-11-12 at 16:23 +0100, Iago Toral wrote:
>>>> Patches 1-4 are,
>>>> Reviewed-by: Iago Toral Quiroga
>>>> Patch 5
On Sat, Nov 14, 2015 at 9:25 AM, Connor Abbott wrote:
> On Sat, Nov 14, 2015 at 11:55 AM, Rob Clark wrote:
>> well, clone just needs to know the number of elements, so this is the
>> simplest possible solution.. not against tracking the type as well, if
>> that is needed elsewhere.. or if there w
On Sat, Nov 14, 2015 at 9:44 AM, Rob Clark wrote:
> On Sat, Nov 14, 2015 at 12:30 PM, Jason Ekstrand wrote:
>> On Sat, Nov 14, 2015 at 8:58 AM, Rob Clark wrote:
>>> On Sat, Nov 14, 2015 at 11:01 AM, Jason Ekstrand
>>> wrote:
On Thu, Nov 12, 2015 at 7:30 AM, Iago Toral wrote:
> On Thu
On Sat, Nov 14, 2015 at 9:30 AM, Jason Ekstrand wrote:
> On Sat, Nov 14, 2015 at 9:25 AM, Connor Abbott wrote:
>> On Sat, Nov 14, 2015 at 11:55 AM, Rob Clark wrote:
>>> well, clone just needs to know the number of elements, so this is the
>>> simplest possible solution.. not against tracking the
On Fri, Nov 13, 2015 at 6:50 PM, Kenneth Graunke wrote:
> The geometry and tessellation control shader stages both read from
> multiple URB entries (one per vertex). The thread payload contains
> several URB handles which reference these separate memory segments.
>
> In GLSL, these inputs are rep
Currently only one metric is exposed but more will be added later.
Signed-off-by: Samuel Pitoiset
Tested-by: Pierre Moreau
---
src/gallium/drivers/nouveau/Makefile.sources | 2 +
src/gallium/drivers/nouveau/nv50/nv50_query_hw.c | 19 +-
.../drivers/nouveau/nv50/nv50_query_hw_metric.c
These compute-related MP performance counters have been reverse
engineered using CUPTI which is part of NVIDIA CUDA.
As for nvc0, we use a compute kernel to read out those performance
counters, and the command stream to configure them. Note that Tesla
only exposes 4 MP performance counters, while
Hi,
Only patch 1/3 has been updated. Patches 4 and 5 of the first version
have been dropped because those groups of GPU counters are going to
be removed.
Thanks.
Samuel Pitoiset (3):
nv50: implement a basic compute support
nv50: add compute-related MP perf counters on G84+
nv50: add suppor
This adds the ability to launch simple compute kernels like the one I
will use to read out MP performance counters in the upcoming patch.
This compute support is based on the work of Francisco Jerez (aka curro)
that he did as part of his EVoC project in 2011/2012 to get OpenCL
working on Tesla. Hi
From: Francisco Jerez
This will make sure that we recalculate the URB layout anytime the URB
size is modified by the L3 partitioning code.
---
src/mesa/drivers/dri/i965/brw_context.h | 2 ++
src/mesa/drivers/dri/i965/brw_state_upload.c | 1 +
src/mesa/drivers/dri/i965/gen7_urb.c | 3
From: Francisco Jerez
---
src/mesa/drivers/dri/i965/intel_reg.h | 53 +++
1 file changed, 53 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/intel_reg.h
b/src/mesa/drivers/dri/i965/intel_reg.h
index a261c2b..0b167d5 100644
--- a/src/mesa/drivers/dri/i965/in
From: Francisco Jerez
This stores the result of can_do_pipelined_register_writes() in the
context struct so we can find out later whether LRI can be used to
program the L3 configuration.
---
src/mesa/drivers/dri/i965/brw_context.h | 5 +
src/mesa/drivers/dri/i965/intel_extensions.c | 8
From: Francisco Jerez
The input of the L3 set-up code is a vector giving the approximate
desired relative size of each partition. This implements logic to
compare the input vector against the table of validated configurations
for the device and pick the closest compatible one.
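A minimal sketch of the selection step described above, under stated assumptions: the partition names, the L1 distance metric and the compatibility rule (a configuration is rejected if it gives no space to a partition the request needs) are all illustrative guesses, not what the patch necessarily does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition set, for illustration only. */
enum { PART_SLM, PART_URB, PART_ALL, PART_DC, NUM_PARTS };

struct l3_weights {
    float w[NUM_PARTS];
};

/* Scale a vector so its components sum to 1 (unchanged if the sum is 0). */
static struct l3_weights
l3_normalize(struct l3_weights v)
{
    float sum = 0.0f;
    for (int i = 0; i < NUM_PARTS; i++)
        sum += v.w[i];
    if (sum > 0.0f) {
        for (int i = 0; i < NUM_PARTS; i++)
            v.w[i] /= sum;
    }
    return v;
}

/* L1 distance between two weight vectors. */
static float
l3_distance(struct l3_weights a, struct l3_weights b)
{
    float d = 0.0f;
    for (int i = 0; i < NUM_PARTS; i++) {
        float diff = a.w[i] - b.w[i];
        d += diff < 0.0f ? -diff : diff;
    }
    return d;
}

/* Assumed compatibility rule: every partition the request uses must have
 * a non-zero allocation in the candidate configuration. */
static int
l3_compatible(struct l3_weights cfg, struct l3_weights want)
{
    for (int i = 0; i < NUM_PARTS; i++) {
        if (want.w[i] > 0.0f && cfg.w[i] == 0.0f)
            return 0;
    }
    return 1;
}

/* Returns the index of the closest compatible table entry, or -1 if no
 * entry is compatible (including when the table is empty). */
static int
l3_pick_closest(const struct l3_weights *table, size_t n,
                struct l3_weights want)
{
    assert(table != NULL || n == 0);

    int best = -1;
    float best_dist = 0.0f;
    struct l3_weights w = l3_normalize(want);

    for (size_t i = 0; i < n; i++) {
        if (!l3_compatible(table[i], want))
            continue;
        float d = l3_distance(l3_normalize(table[i]), w);
        if (best < 0 || d < best_dist) {
            best = (int)i;
            best_dist = d;
        }
    }
    return best;
}
```

Normalizing both sides means only the relative sizes matter, which matches the "approximate desired relative size" wording of the commit message.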
---
src/mesa/driv
From: Francisco Jerez
---
src/mesa/drivers/dri/i965/gen7_l3_state.c | 17 +
src/mesa/drivers/dri/i965/intel_debug.c | 1 +
src/mesa/drivers/dri/i965/intel_debug.h | 1 +
3 files changed, 19 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c
b/src/mesa/dr
git://people.freedesktop.org/~jljusten/mesa cs-shared-variables-v1
http://patchwork.freedesktop.org/bundle/jljusten/cs-shared-variables-v1
Patches 1 - 13:
* Rebased curro's "i965: L3 cache partitioning." (sent Sept 6)
Patches 14 - 19:
* Rework lower_ubo_reference to allow code sharing with
From: Francisco Jerez
This calculates a rather conservative partitioning of the L3 cache
based on the shaders currently bound to the pipeline and whether they
use SLM, atomics, images or scratch space. The result is intended to
be fine-tuned later on based on other pipeline state.
---
src/mesa/
From: Francisco Jerez
This is going to require some rather intrusive kernel changes to fix
properly, in the meantime (and forever on at least pre-v4.1 kernels)
we'll have to restore the hardware defaults at the end of every batch
in which the L3 configuration was changed to avoid interfering with
This allows the code in emit_access to be generic enough to also be
used for lowering shared variables.
Signed-off-by: Jordan Justen
Cc: Samuel Iglesias Gonsalvez
Cc: Iago Toral Quiroga
---
src/glsl/lower_ubo_reference.cpp | 78 ++--
1 file changed, 43 insertions
Signed-off-by: Jordan Justen
---
Notes:
I have ported this commit to shared variable stores:
commit 0cb7d7b4b7c32246d4c4225a1d17d7ff79a7526d
Author: Kristian Høgsberg Kristensen
Date: Wed Oct 21 23:43:34 2015 -0700
i965/fs: Optimize ssbo stores
It is
When an atomic function is called, we need to check to see if it is
for an SSBO variable before lowering it to the SSBO specific intrinsic
function.
Signed-off-by: Jordan Justen
Cc: Samuel Iglesias Gonsalvez
Cc: Iago Toral Quiroga
---
src/glsl/lower_ubo_reference.cpp | 14 ++
1 fil
From: Francisco Jerez
According to the hardware docs a DC flush is sufficient to make
CS_STALL happy, there's no need to add STALL_AT_SCOREBOARD whenever
it's present.
---
src/mesa/drivers/dri/i965/brw_pipe_control.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/mesa
This class has code that will be shared by lower_ubo_reference and
lower_shared_reference. (lower_shared_reference will be used to
support compute shader shared variables.)
Signed-off-by: Jordan Justen
Cc: Samuel Iglesias Gonsalvez
Cc: Iago Toral Quiroga
---
src/glsl/Makefile.sources|
From: Francisco Jerez
Improves performance of the arb_shader_image_load_store-atomicity
piglit test by over 25x (which isn't a real benchmark, it's just heavy
on atomics -- the improvement in a microbenchmark I wrote a while ago
seemed to be even greater). The drawback is one needs to be
extra-ca
This code will also be usable by the pass to lower shared variables.
Note that *const_offset is adjusted by setup_buffer_access so it must
be initialized before calling setup_buffer_access.
Signed-off-by: Jordan Justen
Cc: Samuel Iglesias Gonsalvez
Cc: Iago Toral Quiroga
---
src/glsl/lower_b
Signed-off-by: Jordan Justen
---
src/glsl/nir/glsl_to_nir.cpp | 53 +++
src/glsl/nir/nir_intrinsics.h | 25
2 files changed, 78 insertions(+)
diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 231870d..d787
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_defines.h | 2 ++
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 33
2 files changed, 35 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
b/src/mesa/drivers/dri/i965/brw_defines.h
ind
The atomic functions can also be used with shared variables in compute
shaders.
When lowering the intrinsic in lower_ubo_reference, we still create an
SSBO specific intrinsic since SSBO accesses can be indirectly
addressed, whereas all compute shader shared variables live in a single
shared variabl
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_cs.c| 2 ++
src/mesa/drivers/dri/i965/brw_defines.h | 2 ++
src/mesa/drivers/dri/i965/gen7_cs_state.c | 12
3 files changed, 16 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_cs.c
b/src/mesa/driver
From: Francisco Jerez
It should be possible to use additional L3 configurations other than
the ones listed in the tables of validated allocations ("BSpec »
3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*]
» L3 Allocation and Programming"), but it seems sensible for now to
ha
In this lowering pass, shared variables are decomposed into intrinsic
calls.
Signed-off-by: Jordan Justen
---
src/glsl/Makefile.sources | 1 +
src/glsl/ir_optimization.h | 1 +
src/glsl/linker.cpp | 4 +
src/glsl/lower_shared_reference.cpp | 360 +
For compute shader shared variables we will set a default of column
major.
Signed-off-by: Jordan Justen
---
src/glsl/lower_buffer_access.cpp | 5 +++--
src/glsl/lower_buffer_access.h | 9 +
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/glsl/lower_buffer_access.cpp
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
index cab5af3..2c7e0dc 100644
--- a/src/mesa/dr
Signed-off-by: Jordan Justen
---
src/glsl/nir/glsl_to_nir.cpp | 33 +
src/glsl/nir/nir_intrinsics.h | 3 ++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 67f1aed..cc1719a 100644
-
From: Francisco Jerez
---
src/mesa/drivers/dri/i965/brw_context.h | 4 ++--
src/mesa/drivers/dri/i965/brw_state_upload.c | 4
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h
b/src/mesa/drivers/dri/i965/brw_context.h
index 618d785.
Signed-off-by: Jordan Justen
---
src/glsl/builtin_functions.cpp | 70 +++---
1 file changed, 38 insertions(+), 32 deletions(-)
diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index 3e767e8..bd4c5a3 100644
--- a/src/glsl/builtin_fun
Signed-off-by: Jordan Justen
Cc: Samuel Iglesias Gonsalvez
Cc: Iago Toral Quiroga
---
src/glsl/lower_ubo_reference.cpp | 26 +-
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/src/glsl/lower_ubo_reference.cpp b/src/glsl/lower_ubo_reference.cpp
index b74aa3
When an intrinsic atomic operation is used on a shared variable, we
translate it to a new 'shared variable' specific intrinsic function
call.
For example, a call to __intrinsic_atomic_add, when used on a shared
variable, will be translated to a call to
__intrinsic_atomic_add_shared.
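The renaming in the example above can be sketched as a plain string transformation. This is only an illustration: the helper name is hypothetical, and the assumption that every atomic intrinsic shares the "__intrinsic_atomic_" prefix is extrapolated from the single name given in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the shared-variable variant of an atomic
 * intrinsic name by appending "_shared". Returns false if `name` does not
 * start with the assumed "__intrinsic_atomic_" prefix or if `out` is too
 * small to hold the result. */
static bool
shared_intrinsic_name(const char *name, char *out, size_t out_len)
{
    static const char prefix[] = "__intrinsic_atomic_";

    if (strncmp(name, prefix, strlen(prefix)) != 0)
        return false;

    int n = snprintf(out, out_len, "%s_shared", name);
    return n >= 0 && (size_t)n < out_len;
}
```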
Signed-off-by:
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_fs.h | 2 ++
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 60
2 files changed, 62 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
b/src/mesa/drivers/dri/i965/brw_fs.h
index f40e58b
Shared variables can be accessed by other threads within the same
local workgroup. This prevents us from performing certain
optimizations with shared variables.
Signed-off-by: Jordan Justen
---
src/glsl/opt_constant_propagation.cpp | 3 ++-
src/glsl/opt_constant_variable.cpp| 3 ++-
src/glsl
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_shader.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index a438e18..14c37a0 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/
Signed-off-by: Jordan Justen
---
src/glsl/ast_function.cpp | 18 ++
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 466ece6..da1167a 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@
From: Francisco Jerez
The L3 state atom calculates the target L3 partition weights when the
program bound to some shader stage is modified, and in case they are
far enough from the current partitioning it makes sure that the L3
state is re-emitted.
---
src/mesa/drivers/dri/i965/brw_context.h |
Signed-off-by: Jordan Justen
---
src/glsl/nir/glsl_to_nir.cpp | 29 +
src/glsl/nir/nir_intrinsics.h | 1 +
2 files changed, 30 insertions(+)
diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index b10d192..67f1aed 100644
--- a/src/glsl/nir/gls
Signed-off-by: Jordan Justen
---
src/glsl/lower_variable_index_to_cond_assign.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/glsl/lower_variable_index_to_cond_assign.cpp
b/src/glsl/lower_variable_index_to_cond_assign.cpp
index 1ab3afe..a1ba934 100644
--- a/src/glsl/lower_variable
From: Francisco Jerez
---
src/mesa/drivers/dri/i965/gen7_l3_state.c | 95 +++
1 file changed, 95 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c
b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index 8f9ba5b..48bca29 100644
--- a/src/mesa/drivers/dri
Signed-off-by: Jordan Justen
Cc: Samuel Iglesias Gonsalvez
Cc: Iago Toral Quiroga
---
src/glsl/lower_buffer_access.cpp | 90
src/glsl/lower_buffer_access.h | 2 +
src/glsl/lower_ubo_reference.cpp | 90
3 files
As for nvc0, we need to free memory allocated by interpolation
parameters. This fixes a memory leak spotted by valgrind.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nv50/nv50_program.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouv
Reviewed-by: Ilia Mirkin
Thanks! I missed this in all the commotion of trying to get it
actually working :)
On Sat, Nov 14, 2015 at 5:00 PM, Samuel Pitoiset
wrote:
> As for nvc0, we need to free memory allocated by interpolation
> parameters. This fixes a memory leak spotted by valgrind.
>
> Si
https://bugs.freedesktop.org/show_bug.cgi?id=92954
Bug ID: 92954
Summary: [softpipe] piglit drawbuffer-modes regression
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Keywo
On 2015-11-14 13:43:38, Jordan Justen wrote:
> From: Francisco Jerez
>
> This stores the result of can_do_pipelined_register_writes() in the
> context struct so we can find out later whether LRI can be used to
> program the L3 configuration.
> ---
> src/mesa/drivers/dri/i965/brw_context.h |
Reviewed-by: Jordan Justen
On 2015-11-14 13:43:37, Jordan Justen wrote:
> From: Francisco Jerez
>
> ---
> src/mesa/drivers/dri/i965/intel_reg.h | 53
> +++
> 1 file changed, 53 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_reg.h
> b/src/mesa/
---
src/glsl/nir/glsl_to_nir.cpp | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index b10d192..62eedbf 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -1538,9 +1538,9 @@ nir_visit
Whoops.
Reviewed-by: Connor Abbott
On Sat, Nov 14, 2015 at 8:49 PM, Matt Turner wrote:
> ---
> src/glsl/nir/glsl_to_nir.cpp | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index b10d192..62eedbf 100644
Series is
Reviewed-by: Connor Abbott
Although I'm not as familiar now with the code touched in the last patch.
On Thu, Nov 12, 2015 at 3:13 PM, Jason Ekstrand wrote:
> The subject says it all. This little series adds texture swizzle support
> to nir_lower_tex and makes the i965 driver use tha
The equivalent of the last patch for the hash table. I'm not aware of
any issues this fixes.
Signed-off-by: Connor Abbott
---
src/util/hash_table.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 3247593..466519f 100644
--
Not sure how this wasn't already caught by valgrind, but it fixes an
issue with the vectorizer.
Signed-off-by: Connor Abbott
---
src/util/set.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/util/set.c b/src/util/set.c
index f01f869..331ff58 100644
--- a/src/util/set.c
This series adds an optimization to detect things like:
foo.x = bar.x + baz.x;
foo.y = bar.y + baz.y;
foo.z = bar.z + baz.z;
foo.w = bar.w + baz.w;
and turn them into:
foo = bar + baz;
which shows up distressingly often in shaders translated from D3D
bytecode, or by people who seemingly don't k
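As a rough illustration of the pattern in the example above, the check below decides whether a group of per-component operations can be folded into one vector operation. It is a deliberately simplified sketch: the struct layout and function name are hypothetical, and it only handles the case where each source is read at the same component that is written, unlike a real pass that would also deal with differing swizzles.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar operation: dst.comp = src0.comp <op> src1.comp.
 * Variables are identified by small integer ids; comp 0..3 means x..w. */
struct scalar_op {
    char op;
    int dst, src0, src1;
    int comp;
};

/* Returns true when ops[0..n) use the same opcode, destination and sources,
 * and together write each of the components 0..n-1 exactly once, so that
 * they could be replaced by a single n-wide operation. */
static bool
can_vectorize(const struct scalar_op *ops, size_t n)
{
    if (n == 0 || n > 4)
        return false;

    unsigned seen = 0;
    for (size_t i = 0; i < n; i++) {
        const struct scalar_op *o = &ops[i];

        if (o->op != ops[0].op || o->dst != ops[0].dst ||
            o->src0 != ops[0].src0 || o->src1 != ops[0].src1)
            return false;

        if (o->comp < 0 || o->comp >= (int)n)
            return false;

        /* Reject a component that is written twice. */
        if (seen & (1u << o->comp))
            return false;
        seen |= 1u << o->comp;
    }

    /* n distinct components, all below n, means every one is covered. */
    return true;
}
```

With foo, bar and baz given ids 0, 1 and 2, the four additions from the example pass this check and could be emitted as foo = bar + baz.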
Signed-off-by: Connor Abbott
---
src/glsl/nir/nir_array.h | 21 +
1 file changed, 21 insertions(+)
diff --git a/src/glsl/nir/nir_array.h b/src/glsl/nir/nir_array.h
index 1db4e8c..d704119 100644
--- a/src/glsl/nir/nir_array.h
+++ b/src/glsl/nir/nir_array.h
@@ -84,13 +84,34 @@
This effectively does the opposite of nir_lower_alus_to_scalar, trying
to combine per-component ALU operations with the same sources but
different swizzles into one larger ALU operation. It uses a similar
model as CSE, where we do a depth-first approach and keep around a hash
set of instructions to
Shader-db results on bdw with INTEL_DEBUG=vec4:
total instructions in shared programs: 1634044 -> 1612936 (-1.29%)
instructions in affected programs: 802502 -> 781394 (-2.63%)
helped: 5036
HURT: 1442
total cycles in shared programs: 9397790 -> 9355382 (-0.45%)
cycles in affected programs: 5078600
On Nov 14, 2015 6:40 PM, "Connor Abbott" wrote:
>
> Series is
>
> Reviewed-by: Connor Abbott
Thanks! I'll make sure to have ken or matt take a quick look at the last
one.
Now, if only someone would review the nir_shader_clone patch...
> Although I'm not as familiar now with the code touched i