Jason Ekstrand writes:
> On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
> wrote:
>
>> Even though my plan was to send the remaining changes for SIMD32 as a
>> single last series, I'm feeling too sleep-deprived to finish cleaning
>> up the rest of the series today so I'll send them in another s
Jordan Justen writes:
> On 2016-05-26 20:46:29, Francisco Jerez wrote:
>> This should have the side effect of enabling the ARB_compute_shader
>> extension on Gen8+ hardware and all Gen7 platforms that didn't
>> previously expose it (VLV and IVB GT1) due to the number of hardware
>> threads per su
Hi,
On Fri, May 27, 2016 at 7:36 PM, Nicolas Boichat wrote:
> Hi Emil,
>
> Took us some time to clean things up, but we got an ebuild and repo to
> share with you.
>
> On Tue, May 24, 2016 at 10:52 PM, Emil Velikov
> wrote:
> [snip]
>>> We also set PKGCONFIG="false", because, well, we do not ha
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 20 +---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 73c6327..c42bd34 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mes
ARB_compute_shader was apparently the last feature missing.
---
src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/
This requires using a bitset instead of a boolean flag to keep track
of the GRFs we've seen a generating instruction for already. The
search loop continues until all instructions initializing the value of
the source VGRF have been found, or it is determined that coalescing
is not possible.
---
sr
We know that there cannot be any destination dependency race if we
reach the beginning or end of the program without having found any
other instruction the send could possibly race with. This avoids
emitting a pile of useless moves at the beginning or end of the
program in the most common case in
I haven't found any evidence that this isn't supported by the
hardware; in fact, according to the SNB hardware spec:
"The supported regioning modes for math instructions are align16,
align1 with the following restrictions:
- Scalar source is supported.
[...]
- Source and destination offs
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 60
1 file changed, 13 insertions(+), 47 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7dc7c5b..b1259da 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
This will prevent some shader-db regressions when we start plumbing
logical sends through the optimizer.
---
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 13 +
1 file changed, 13 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
b/src/mesa/drivers/dri/i965/brw_fs_cse.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 28 ++--
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5d26c68..7dc7c5b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/sr
This will be useful in several places. The only externally visible
difference (other than non-VGRF files being supported now) is that the
region sizes are now passed in byte units instead of in GRF units
because the loss of precision would have become a problem in the SIMD
lowering pass.
---
.../
Compute-to-mrf was checking whether the destination of scan_inst is
more than one component (making assumptions about the instruction data
type) in order to find out whether the result is being fully copied
into the MRF destination, which is rather inaccurate in cases where a
single-component instr
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 50552cb..660a8db 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers
Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out safely without breaking the program are rather
complex: The destination must be exactly one component of at most the
execution width of the lo
The sampler EOT optimization pass naively assumes that the texturing
instruction provides all the data used by the FB write just because
they're standing next to each other. The least we should be checking
is whether the source and destination regions of the FB write and
texturing instructions mat
---
src/mesa/drivers/dri/i965/brw_shader.cpp | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 7863003..9199ecd 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_sha
Just to make sure we keep the SIMD lowering pass tidy when we
introduce additional logic to try to optimize out the copy
instructions used to zip and unzip the destination and source regions
into multiple packed regions of the lowered instruction width.
Shouldn't cause any functional changes.
---
This is the case for SNB math instructions so we need to be careful
and insert the literal value of the immediate into the table (rather
than its absolute value) if the instruction is unable to invert the
sign of the constant on the fly.
---
src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp |
This will be required to correctly transform the destination of 8-wide
instructions that write a single GRF of a VGRF to MRF copy marked
COMPR4.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 24 +++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dr
Logical sends are eventually lowered into a series of copies so they
can take almost anything as source.
---
.../drivers/dri/i965/brw_fs_copy_propagation.cpp | 34 ++
1 file changed, 34 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
b/src/m
There are two reasons why this is useful:
- It avoids the introduction of a number of partial writes emitted
by the SIMD lowering pass to zip and unzip register regions early
during optimization, which can make subsequent optimization less
effective.
- It substantially reduces the bur
This will be useful in the SIMD lowering pass to avoid having to
construct a builder object of the known region width just to pass it
as argument to offset(), which doesn't do anything with it other than
taking the builder dispatch_width as region width.
---
src/mesa/drivers/dri/i965/brw_fs.h|
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +++
1 file changed, 7 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ca6368e..8eff27e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -5
No shader-db regressions.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 20feb6f..3c468d8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/m
This makes the whole LOAD_PAYLOAD munging unnecessary which simplifies
the code and will allow the optimization to succeed in more cases
independent of whether the LOAD_PAYLOAD instruction can be found or
not.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 79 +++-
1 fi
This will allow compute_to_mrf to handle cases where the source of the
VGRF-to-MRF copy is initialized by more than one instruction. In such
cases we cannot rewrite the destination of any of the generating
instructions until it's known whether the whole VGRF source region can
be coalesced into the
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 15 +--
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3c468d8..45c4753 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/
This fixes the few code quality regressions from the previous series
enabling SIMD32 CS codegen in the back-end -- AFAICT by the end of the
series we can finally enable GL 4.3 on all Gen8+ hardware.
Patches 1-8 delay the SIMD lowering pass until after the bulk of
optimization passes have been run, which
---
src/compiler/glsl/ast_to_hir.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index a2eb32d..7c5c4e5 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2946,8 +294
This stops the offset being bumped again when an explicit
alignment has already been applied.
Fixes alignment issues in:
GL44-CTS.enhanced_layouts.uniform_block_alignment
Note the test still fails due to unrelated issues with doubles.
---
src/compiler/glsl/ast_to_hir.cpp | 4 +---
1 file cha
On 19 May 2016 at 23:17, Rhys Kidd wrote:
> Correct use of qir_dump_inst() within QIR validate pass.
>
> Reported by the following GCC warning:
>
> mesa/src/gallium/drivers/vc4/vc4_qir_validate.c: In function 'fail_instr':
> mesa/src/gallium/drivers/vc4/vc4_qir_validate.c:31:23: warning: passing
https://bugs.freedesktop.org/show_bug.cgi?id=96254
Bug ID: 96254
Summary: [softpipe] piglit unsized-array-not-in-last-position
regression
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
On 2016-05-26 20:46:29, Francisco Jerez wrote:
> This should have the side effect of enabling the ARB_compute_shader
> extension on Gen8+ hardware and all Gen7 platforms that didn't
> previously expose it (VLV and IVB GT1) due to the number of hardware
> threads per subslice being insufficient in S
Emil Velikov wrote:
On 27 May 2016 at 15:40, Christian König wrote:
No, what I'm saying is that it is a number and not an enum.
This way you don't need to change the specification when you want to support
a new level.
That's the case indeed. Thanks for explaining.
That's handy, FWIW the r
FYI: I am planning to get to this. I've just been too busy with the branch
point and this hasn't seemed like something we need to get in by then.
I'll take a look on Monday.
--Jason
On Fri, May 27, 2016 at 1:30 AM, Kenneth Graunke
wrote:
> On Wednesday, May 25, 2016 7:08:35 PM PDT Topi Pohjolai
On Fri, May 27, 2016 at 2:46 PM, Jordan Justen
wrote:
> On 2016-05-27 14:23:39, Jason Ekstrand wrote:
> >On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
> > wrote:
> >
> > This thread ID uniform will be used to compute the
> > gl_LocalInvocationIndex and gl_LocalInvocationID val
On 27 May 2016 at 21:11, Jason Ekstrand wrote:
> On Fri, May 27, 2016 at 11:49 AM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
>
>> Hello.
>>
>> http://en.cppreference.com/w/cpp/thread/future
>> http://en.cppreference.com/w/cpp/thread/async
>>
>> Assumption: Shader compilation will need to run on separate t
https://bugs.freedesktop.org/show_bug.cgi?id=96247
Vinson Lee changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
On Fri, May 27, 2016 at 1:16 PM, Jason Ekstrand
wrote:
>
>
> On Fri, May 27, 2016 at 1:10 PM, Francisco Jerez
> wrote:
>
>> Jason Ekstrand writes:
>>
>> > On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez > >
>> > wrote:
>> >
>> >> ---
>> >> src/mesa/drivers/dri/i965/brw_fs.cpp | 26 +++
Am 27.05.2016 um 23:57 schrieb Brian Paul:
> gcc didn't warn about the unsigned / enum pipe_prim_type mismatch
> between the .c and .h file.
> ---
> src/gallium/auxiliary/util/u_debug.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/util/u_debug.c
gcc didn't warn about the unsigned / enum pipe_prim_type mismatch
between the .c and .h file.
---
src/gallium/auxiliary/util/u_debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/auxiliary/util/u_debug.c
b/src/gallium/auxiliary/util/u_debug.c
index 0d63cfe..3a9
On 2016-05-27 14:23:39, Jason Ekstrand wrote:
>On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
> wrote:
>
> This thread ID uniform will be used to compute the
> gl_LocalInvocationIndex and gl_LocalInvocationID values.
>
> It is important for this uniform to be added in the
1-7 are
Reviewed-by: Matt Turner
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
On 2016-05-27 14:05:29, Jason Ekstrand wrote:
>On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
> wrote:
>
> v2:
> * simd16/32 fixes (curro)
>
> Signed-off-by: Jordan Justen
> ---
> src/compiler/nir/nir_intrinsics.h| 1 +
> src/mesa/drivers/dri/i96
On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
wrote:
> This thread ID uniform will be used to compute the
> gl_LocalInvocationIndex and gl_LocalInvocationID values.
>
> It is important for this uniform to be added in the last push constant
> register. fs_visitor::assign_constant_locations is up
On Fri, May 27, 2016 at 1:31 PM, Jason Ekstrand
wrote:
>
>
> On Fri, May 27, 2016 at 11:24 AM, Jordan Justen > wrote:
>
>> This will be important when we start adding a uniform for the CS
>> thread local invocation index.
>>
>> Signed-off-by: Jordan Justen
>> ---
>> src/intel/vulkan/anv_pipeli
On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
wrote:
> v2:
> * simd16/32 fixes (curro)
>
> Signed-off-by: Jordan Justen
> ---
> src/compiler/nir/nir_intrinsics.h| 1 +
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
> 2 files changed, 16 insertions(+)
>
> diff --git
https://bugs.freedesktop.org/show_bug.cgi?id=95005
--- Comment #15 from Bas Nieuwenhuizen ---
A radeonsi GL4.3 patch was just pushed.
As far as Shawn's comment goes I think he's referring to the ARK games, which
still fail. Those also have a separate bug at
https://bugs.freedesktop.org/show_bug.
https://bugs.freedesktop.org/show_bug.cgi?id=95005
--- Comment #14 from Alexandre Demers ---
Since both patches were pushed (and the piglit test fixed) and since nouveau
driver is now enabling OpenGL 4.3 by default (for nvc0), what's preventing us
from exposing OpenGL 4.3 for radeonsi also? Both
On Thu, May 26, 2016 at 8:06 PM, Jason Ekstrand wrote:
>
> On May 26, 2016 7:06 PM, "Ian Romanick" wrote:
>>
>> On 05/26/2016 06:30 PM, Jason Ekstrand wrote:
>> > This shrinks the .text section of nir_opt_algebraic.o by 30.5 KB:
>> >
>> >text data bss dec hex filename
>> >
On 2016-05-27 13:18:54, Jason Ekstrand wrote:
>On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
> wrote:
>
> diff --git a/src/mesa/state_tracker/st_extensions.c
> b/src/mesa/state_tracker/st_extensions.c
> index 68e6601..8f249bb 100644
> --- a/src/mesa/state_tracker/st_e
On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
wrote:
> This will be important when we start adding a uniform for the CS
> thread local invocation index.
>
> Signed-off-by: Jordan Justen
> ---
> src/intel/vulkan/anv_pipeline.c | 32 +++-
> 1 file changed, 31 inserti
On 28 May 2016 at 06:11, Marek Olšák wrote:
> From: Marek Olšák
>
Reviewed-by: Dave Airlie
2-5 Reviewed-by: Jordan Justen
On 2016-05-20 12:41:08, Jason Ekstrand wrote:
> Also, we don't actually need it for clipping because meta always colors
> inside the lines and, for all other operations, the user is required to set
> a scissor. Since DRAWING_RECTANGLE stalls the GPU, we want to emi
On Fri, May 27, 2016 at 11:24 AM, Jordan Justen
wrote:
> v2:
> * Move lower flag to context constants. (Ken)
>
> Signed-off-by: Jordan Justen
> Reviewed-by: Kenneth Graunke (v1)
> ---
> src/compiler/glsl/builtin_variables.cpp | 29
> ++---
> src/compiler/glsl/glsl_par
On Fri, May 27, 2016 at 1:10 PM, Francisco Jerez
wrote:
> Jason Ekstrand writes:
>
> > On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
> > wrote:
> >
> >> ---
> >> src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++
> >> 1 file changed, 26 insertions(+)
> >>
> >> diff --git
From: Marek Olšák
From the OpenGL 4.5 core spec:
"An INVALID_VALUE error is generated if texture is not zero and level is
not a supported texture level for textarget, as described above."
Other FramebufferTexture functions already do the same thing.
This fixes the main menu in F1 2015.
Cc:
Jason Ekstrand writes:
> On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
> wrote:
>
>> ---
>> src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++
>> 1 file changed, 26 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.
On Fri, May 27, 2016 at 11:49 AM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> Hello.
>
> http://en.cppreference.com/w/cpp/thread/future
> http://en.cppreference.com/w/cpp/thread/async
>
> Assumption: Shader compilation will need to run on separate thread(s).
>
> From a certain perspective, one of the easy
On Fri, May 27, 2016 at 3:56 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> v2: don't use PFP_SYNC_ME on R700
All 6 patches look reasonable to me.
Reviewed-by: Alex Deucher
> ---
> src/gallium/drivers/r600/evergreen_hw_context.c | 13 +++--
> src/gallium/drivers/r600/evergreend.h
On Fri, May 27, 2016 at 1:01 PM, Jordan Justen
wrote:
> On 2016-05-20 12:41:04, Jason Ekstrand wrote:
> > Instead of blasting it out as part of the pipeline, we put it in the
> > command buffer and only blast it out when it's really needed. Since the
> > PUSH_CONSTANT_ALLOC commands aren't pipel
On 2016-05-20 12:41:04, Jason Ekstrand wrote:
> Instead of blasting it out as part of the pipeline, we put it in the
> command buffer and only blast it out when it's really needed. Since the
> PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a
> stall which we would like to av
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> Even though my plan was to send the remaining changes for SIMD32 as a
> single last series, I'm feeling too sleep-deprived to finish cleaning
> up the rest of the series today so I'll send them in another series
> tomorrow.
>
> The patches
On 28 May 2016 at 05:47, Marek Olšák wrote:
> From: Marek Olšák
>
> We'll sort out any issues in rc versions.
Not sure it needs reviewing,
but definitely
Acked-by: Dave Airlie
> ---
> src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --g
I don't think this is needed if we make no8 and no16 work properly.
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> The do32 INTEL_DEBUG option causes the back-end to try to generate a
> SIMD32 program when compiling a compute shader regardless of the
> specified compute shader workgro
From: Marek Olšák
v2: don't use PFP_SYNC_ME on R700
---
src/gallium/drivers/r600/evergreen_hw_context.c | 13 +++--
src/gallium/drivers/r600/evergreend.h | 1 +
src/gallium/drivers/r600/r600_blit.c| 6 --
src/gallium/drivers/r600/r600_hw_context.c | 25 ++
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++
> 1 file changed, 26 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 1f3b23b..7002346 100644
On Fri, May 27, 2016 at 3:44 PM, Marek Olšák wrote:
> On Fri, May 27, 2016 at 9:03 PM, Alex Deucher wrote:
>> On Fri, May 27, 2016 at 2:18 PM, Marek Olšák wrote:
>>> From: Marek Olšák
>>>
>>> R600-R700 used a bad workaround. Now only R600 has to use it.
>>> ---
>>> src/gallium/drivers/r600/eve
From: Marek Olšák
We'll sort out any issues in rc versions.
---
src/gallium/drivers/radeonsi/si_pipe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
b/src/gallium/drivers/radeonsi/si_pipe.c
index eefc68a..7288180 100644
--- a/src/gall
https://bugs.freedesktop.org/show_bug.cgi?id=96238
Franck Delache changed:
What|Removed |Added
CC||franck.delache@grassvalley.
On Fri, May 27, 2016 at 9:03 PM, Alex Deucher wrote:
> On Fri, May 27, 2016 at 2:18 PM, Marek Olšák wrote:
>> From: Marek Olšák
>>
>> R600-R700 used a bad workaround. Now only R600 has to use it.
>> ---
>> src/gallium/drivers/r600/evergreen_hw_context.c | 13 +++--
>> src/gallium/driver
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> The Gen5+ sampler message payload construction code steps through the
> coordinate and derivative components by induction like 'coordinate =
> offset(coordinate, bld, 1)', the problem is that while doing that it
> may step one past the end
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> The conditional mod of these instructions determines the semantics of
> the comparison itself (rather than being evaluated based on the result
> of the instruction as is usually the case for most other instructions
> that allow conditional
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> This prevents false dependencies from being created between
> instructions that write disjoint 8-bit portions of the flag register
> and OTOH should make sure that the scheduler considers dependencies
> between instructions that write or r
On Thu, May 26, 2016 at 8:46 PM, Francisco Jerez
wrote:
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 54
> +--
> src/mesa/drivers/dri/i965/brw_ir_fs.h | 25 ++--
> 2 files changed, 68 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/
On 27 May 2016 at 22:40, Bas Nieuwenhuizen wrote:
> Reviewed-by: Bas Nieuwenhuizen
Tested-by: Dave Airlie
Passes the CTS test now.
Dave.
On Fri, May 27, 2016 at 2:18 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> R600-R700 used a bad workaround. Now only R600 has to use it.
> ---
> src/gallium/drivers/r600/evergreen_hw_context.c | 13 +++--
> src/gallium/drivers/r600/evergreend.h | 1 +
> src/gallium/drivers/r6
Hello.
http://en.cppreference.com/w/cpp/thread/future
http://en.cppreference.com/w/cpp/thread/async
Assumption: Shader compilation will need to run on separate thread(s).
From a certain perspective, one of the easy ways of removing Mesa shader
compilation from the "main" thread would be to use std
From: Marek Olšák
---
src/gallium/drivers/r600/evergreen_state.c | 25 +++--
src/gallium/drivers/r600/r600_hw_context.c | 4
src/gallium/drivers/r600/r600_state.c| 25 +++--
src/gallium/drivers/r600/r600_state_common.c | 3 ---
4 files c
From: Marek Olšák
The main impact is that fast color clear doesn't flush TC, CONST, DB.
---
src/gallium/drivers/r600/evergreen_hw_context.c | 18 +++---
src/gallium/drivers/r600/r600_blit.c| 2 +-
src/gallium/drivers/r600/r600_pipe.h| 20 +++-
From: Marek Olšák
The main impact is that {upload, draw, upload, draw, ..} doesn't flush
framebuffer caches before every upload.
---
src/gallium/drivers/r600/r600_hw_context.c | 15 +--
1 file changed, 1 insertion(+), 14 deletions(-)
diff --git a/src/gallium/drivers/r600/r600_hw_con
This can improve performance of GPU-bound tests that benefit from warm caches
from previous draw calls.
You can also get it from here:
git://people.freedesktop.org/~mareko/mesa r600-opt-flushes
Not tested with piglit. If somebody wants to run piglit with this, please let
me know which GPUs y
But it will be used in the future, when we need to support dynamic
formats, with OpenCL. I'd rather leave this in.
-ilia
On Fri, May 27, 2016 at 4:14 AM, Samuel Pitoiset
wrote:
> This codegen lib code is no longer used for Kepler since we convert
> the formats directly in the lowering pass.
>
Reviewed-by: Ilia Mirkin
On Fri, May 27, 2016 at 4:14 AM, Samuel Pitoiset
wrote:
> This code was used for validating surfaces with compute but now we use
> pipe_image_view instead. Anyway, surfaces support should be
> re-introduced properly once OpenCL happens.
>
> Signed-off-by: Samuel Pitoiset
The old method pushed the uvec3 gl_LocalInvocationID data for each
channel.
The new method pushes 1 dword of data that is a 'thread local ID'
value. Based on that value, we can generate gl_LocalInvocationIndex
and gl_LocalInvocationID with some simple calculations.
Signed-off-by: Jordan Just
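For readers following the "simple calculations" mentioned above, here is a
minimal sketch of the index-to-ID step only. It relies on the GLSL definition
gl_LocalInvocationIndex = z * size_x * size_y + y * size_x + x and simply
inverts it. The struct and function names are hypothetical and are not taken
from the patch; how the pushed per-thread value is turned into the index in
the first place is not shown in this excerpt and is not assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not from the Mesa sources. */
struct local_id {
   uint32_t x, y, z;
};

/*
 * Invert the GLSL definition
 *   gl_LocalInvocationIndex = z * size_x * size_y + y * size_x + x
 * to recover the three gl_LocalInvocationID components.
 */
static struct local_id
local_id_from_index(uint32_t index, uint32_t size_x, uint32_t size_y)
{
   assert(size_x > 0 && size_y > 0);

   struct local_id id;
   id.x = index % size_x;
   id.y = (index / size_x) % size_y;
   id.z = index / (size_x * size_y);
   return id;
}

/* Forward direction, useful for checking the round trip. */
static uint32_t
index_from_local_id(struct local_id id, uint32_t size_x, uint32_t size_y)
{
   return id.z * size_x * size_y + id.y * size_x + id.x;
}
```

With a local size of 4x2 in x and y, index 13 decomposes to (1, 1, 1), and
feeding that back through index_from_local_id() gives 13 again.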
TBH I don't like this. The way it is now, there's an obvious
correlation between the numbers uploaded, and the for loops/etc which
actually stick the data into the pushbuf. After your change, it's not
at all clear, and should those numbers become disconnected it'll be
difficult to track down.
If y
Signed-off-by: Jordan Justen
---
src/intel/vulkan/anv_cmd_buffer.c | 52 --
src/intel/vulkan/anv_pipeline.c| 4 +++
src/intel/vulkan/anv_private.h | 1 -
src/intel/vulkan/gen7_cmd_buffer.c | 10 ++--
src/intel/vulkan/gen8_cmd_buffer.c | 13 --
This will be important when we start adding a uniform for the CS
thread local invocation index.
Signed-off-by: Jordan Justen
---
src/intel/vulkan/anv_pipeline.c | 32 +++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/src/intel/vulkan/anv_pipeline.c b/
v2:
* simd16/32 fixes (curro)
Signed-off-by: Jordan Justen
---
src/compiler/nir/nir_intrinsics.h| 1 +
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
2 files changed, 16 insertions(+)
diff --git a/src/compiler/nir/nir_intrinsics.h
b/src/compiler/nir/nir_intrinsics.h
i
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_compiler.h | 1 +
src/mesa/drivers/dri/i965/brw_fs.cpp | 13 +++--
src/mesa/drivers/dri/i965/gen7_cs_state.c | 32 ++-
3 files changed, 22 insertions(+), 24 deletions(-)
diff --git a/src/mes
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.
We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocation
This pass replaces the local id and local index intrinsics with i965
specific nir code.
It relies on the gl_i965_cs_thread_local_id uniform variable which
actually varies per thread to provide a thread local id.
Signed-off-by: Jordan Justen
---
src/mesa/drivers/dri/i965/brw_program.c | 1 +
1 f
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.
It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.
The reason this is important is
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver-specific lowered nir code.
We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.
We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex
git://people.freedesktop.org/~jljusten/mesa hsw-cs-cross-thread-constants-v2
v2:
* Add v1 feedback (as noted in patch commit messages)
* Add vulkan support
Tested with curro's simd32 CS series. The IDs appear to be working
with simd32, and the UE4 elemental ran with INTEL_DEBUG=do32. (Tested
on
v2:
* Move lower flag to context constants. (Ken)
Signed-off-by: Jordan Justen
Reviewed-by: Kenneth Graunke (v1)
---
src/compiler/glsl/builtin_variables.cpp | 29 ++---
src/compiler/glsl/glsl_parser_extras.cpp | 2 +-
src/compiler/glsl/ir.h | 3 ++-
We need information about push constants in a few places for the GL
driver, and another couple places for the vulkan driver.
When we add support for uploading both a common (cross-thread) set of
push constants, combined with the previous per-thread push constant
data, things are going to get even
Signed-off-by: Jordan Justen
---
src/compiler/nir/nir.c | 4
src/compiler/nir/nir.h | 2 ++
src/compiler/nir/nir_gather_info.c | 1 +
src/compiler/nir/nir_intrinsics.h | 1 +
src/compiler/nir/nir_lower_system_values.c | 16
We added this support into nir a while ago in
a9e6213edd757980475167331bda15c3970a538d for Mesa's Intel vulkan
driver as part of the SPIR-V support, so we can use it for the i965
driver as well.
Signed-off-by: Jordan Justen
Reviewed-by: Kenneth Graunke
---
src/mesa/drivers/dri/i965/brw_context.