v2: - only add it if the ext is enabled (Ilia)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
Reviewed-by: Nicolai Hähnle
---
src/compiler/glsl/builtin_variables.cpp | 6 ++
src/compiler/shader_enums.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/src
v3: - use a new case statement in r600_pipe_common.c
- fix compilation of softpipe...
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
Reviewed-by: Nicolai Hähnle
---
src/gallium/docs/source/screen.rst | 4
src/gallium/drivers/ilo/ilo_screen.c | 2 ++
src
This extension is only exposed if the underlying driver supports
ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK
is set.
v3: - initialize max_variable_threads_per_block to 0
v2: - expose the ext based on that new cap
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
On 10/05/2016 08:57 PM, Ilia Mirkin wrote:
On Wed, Oct 5, 2016 at 2:48 PM, Samuel Pitoiset
wrote:
Only expose 512 threads/block on Fermi to not be limited by
32 GPRs/thread.
v4: - use 512 threads on Fermi, 2014 on Kepler+
Dyslexics... untie!
Ahah! :)
Typo...
Signed-off-by: Samuel
On 10/06/2016 09:25 AM, Nicolai Hähnle wrote:
On 05.10.2016 20:48, Samuel Pitoiset wrote:
v4: - slightly indent spec quotes (Nicolai)
- drop useless _mesa_has_compute_shaders() check (Nicolai)
- move the fixed local size outside of the loop (Nicolai)
- add missing check for
On 10/06/2016 09:27 AM, Nicolai Hähnle wrote:
On 05.10.2016 20:48, Samuel Pitoiset wrote:
This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.
v4: - add missing '%s' in the monster format string
Signed-off-
Checking if MAD is supported is definitely wrong, and it's
more likely a typo I introduced a few days ago, which breaks
NV50 because SHLADD is not supported there.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
1 file changed, 1 insertion(
I will run piglit with that patch before pushing.
Reviewed-by: Samuel Pitoiset
On 10/06/2016 11:33 PM, Karol Herbst wrote:
total instructions in shared programs : 2818606 -> 2818227 (-0.01%)
total gprs used in shared programs: 379273 -> 379238 (-0.01%)
total local used in shared pr
Usually we prefix with gm107/ir, gk110/ir, etc...
More comments below.
On 10/08/2016 05:43 PM, Karol Herbst wrote:
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 --
1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/src
"rework" is not the right term in my opinion. :)
On 10/08/2016 05:43 PM, Karol Herbst wrote:
we might want to add more folding passes here, so make it a bit more generic
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 124 ++---
1 file chan
On 10/08/2016 05:43 PM, Karol Herbst wrote:
just little random noise in shader-db
Like what? Please elaborate.
will help in the next patch
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --g
On 10/08/2016 05:43 PM, Karol Herbst wrote:
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir.h| 2 +-
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20 +++-
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/src/gallium
Please, update the prefix.
Also the same comment applies here, and I think the best way is to
enable that PostRAConstantFoldingPass for nvc0+ in a separate patch at
the end of that series. That way you won't break things and mupuf will
appreciate. :)
On 10/08/2016 05:43 PM, Karol Herbst wrot
This breaks a bunch of things, like:
spec/glsl-4.30/execution/built-in-functions/cs-all-bvec2-using-if: fail
spec/glsl-4.30/execution/built-in-functions/cs-all-bvec3-using-if: fail
spec/glsl-4.30/execution/built-in-functions/cs-all-bvec4-using-if: fail
spec/glsl-4.30/execution/built-in-functions/
On 10/08/2016 07:59 PM, Karol Herbst wrote:
2016-10-08 18:54 GMT+02:00 Samuel Pitoiset :
Please, update the prefix.
Also the same comment applies here, and I think the best way is to enable
that PostRAConstantFoldingPass for nvc0+ in a separate patch at the end of
that series. That way you
On 10/08/2016 09:26 PM, Ilia Mirkin wrote:
Pretty sure that the float one is fine. And there's a 20th bit, it
just behaves differently than one might expect. I don't remember all
the details though...
Yep, the float one is correct. The 20th bit is the sign bit, which is
correctly emitted in
852 852
hurt 0 44 23 23
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 94 ++
1 file changed, 94 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.c
This exposes OpenGL 4.5 on Fermi and Kepler GPUs. Maxwell still
only exposes OpenGL 4.1 because I need to finish my instructions
scheduler calculator.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff
All ARB_enhanced_layouts piglit tests pass without any changes
in our compiler.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
b/src/gallium
This looks good to me, but fyi this doesn't fix the regressions
introduced with "nv50/ir: start LocalCSE with getFirst to merge PHI
instructions". Something else is probably wrong.
Reviewed-by: Samuel Pitoiset
overflowed the call stack, in case a function had a lot of early
ret
On 10/08/2016 10:04 PM, Karol Herbst wrote:
looks great, a few comments below
Thanks!
2016-10-08 21:55 GMT+02:00 Samuel Pitoiset :
total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs:335256 -> 335273 (0.01%)
total local used in
On 10/08/2016 10:09 PM, Ilia Mirkin wrote:
On Sat, Oct 8, 2016 at 3:55 PM, Samuel Pitoiset
wrote:
total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs:335256 -> 335273 (0.01%)
total local used in shared programs :31968 -> 31
This fixes a crash while replaying a trace from F1 2015.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/ddebug/dd_context.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/src/gallium/drivers/ddebug/dd_context.c
b/src/gallium/drivers/ddebug/dd_context.c
index edcbf2c
On 10/09/2016 09:28 PM, Karol Herbst wrote:
2016-10-09 13:58 GMT+02:00 Samuel Pitoiset :
On 10/08/2016 10:04 PM, Karol Herbst wrote:
looks great, a few comments below
Thanks!
2016-10-08 21:55 GMT+02:00 Samuel Pitoiset :
total instructions in shared programs :2286901 -> 2284
When offset != 0, the valid range was wrong because the second
argument of util_range_add() is end, not size.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 1 +
src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 1 +
src/gallium/drivers
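The end-versus-size mix-up described in the snippet above can be sketched as follows. This is only an illustration under stated assumptions: `struct byte_range`, `byte_range_init()`, `byte_range_add()` and `mark_written()` are hypothetical stand-ins, not the Mesa helpers touched by the patch. The only point is that a range helper taking (start, end) must be passed `offset + size`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a "valid range" tracker covering [start, end). */
struct byte_range {
    unsigned start;
    unsigned end;
};

/* An empty range: any later add will shrink start and grow end. */
static void byte_range_init(struct byte_range *r)
{
    r->start = UINT_MAX;
    r->end = 0;
}

/* Extend the range; the third argument is an END offset, not a size. */
static void byte_range_add(struct byte_range *r, unsigned start, unsigned end)
{
    assert(start <= end);
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* Mark `size` bytes at `offset` as valid. Passing `size` directly as the
 * end would only be right when offset == 0. */
static void mark_written(struct byte_range *r, unsigned offset, unsigned size)
{
    byte_range_add(r, offset, offset + size);
}
```

With an offset of 256 and a size of 64, the tracked range becomes [256, 320). Passing the size as the end would instead ask for [256, 64), which is not a valid interval.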
Confirmed.
Reviewed-by: Samuel Pitoiset
On 10/10/2016 06:12 PM, Ilia Mirkin wrote:
Normally the value is an immediate, which is moved to some temporary, so
there's no problem. In the case of a non-constant offset (as allowed by
ARB_gpu_shader5), we have to take care to copy it first b
852 852
hurt 0 44 23 23
v2: - use visit(Instruction *)
- use getUniqueInsn()
- use getImmediate()
- fix mod for src0
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 87 +
On 10/11/2016 11:17 PM, Ilia Mirkin wrote:
On Tue, Oct 11, 2016 at 5:01 PM, Samuel Pitoiset
wrote:
total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs:335256 -> 335273 (0.01%)
total local used in shared programs :31968 -> 31
Unfortunately, this introduces some regressions:
spec/arb_gpu_shader5/texturegatheroffset/fs-r-none-shadow-2d: fail
spec/arb_gpu_shader5/texturegatheroffset/fs-r-none-shadow-2darray: fail
spec/arb_gpu_shader5/texturegatheroffset/fs-r-none-shadow-2drect: fail
spec/arb_gpu_shader5/texturegatheroffs
Sounds reasonable.
Reviewed-by: Samuel Pitoiset
On 10/12/2016 02:51 AM, Ilia Mirkin wrote:
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.
Signed-off-by: Ilia Mirkin
Cc: mesa-sta
You fixed the recent regressions, thanks!
Reviewed-by: Samuel Pitoiset
On 10/12/2016 04:26 PM, Ilia Mirkin wrote:
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be
I think we could also use those copy modifiers in some other places.
Reviewed-by: Samuel Pitoiset
On 10/12/2016 07:32 PM, Ilia Mirkin wrote:
First off, src2 was being given the wrong modifier, and secondly we were
forgetting to clear src0's modifier. Instead let's use the
Valu
On 10/13/2016 10:31 AM, Andreas Boll wrote:
nak, neither radeonsi nor i965 advertise GLSL 4.50.
Nicolai hasn't pushed the patch to enable GLSL 4.50 [1].
I'm not sure what the plan for nouveau is [2].
The plan is to not change docs/features.txt for nvc0 (not yet). :)
See also https://patc
t's
just for safety, but if the compiler adds modifiers to SHL something is
really wrong...
Looks good now.
Reviewed-by: Samuel Pitoiset
return false;
if (!shl->src(1).getImmediate(imm))
return false;
- mod[0] = add->src(0).mod;
- mod[1] = add->src(1)
On 10/13/2016 03:56 PM, Ilia Mirkin wrote:
On Thu, Oct 13, 2016 at 9:53 AM, Samuel Pitoiset
wrote:
On 10/12/2016 08:42 PM, Ilia Mirkin wrote:
src2 was being given the wrong modifier, and we were not properly
managing the modifier on the SHL source either.
Signed-off-by: Ilia Mirkin
On 08/14/2016 04:22 AM, Ilia Mirkin wrote:
It's fairly rare that the BB layout puts BBs after the exit block, which
is likely the reason these issues lingered for so long.
This fixes a fraction of issues with the giant pixmark piano shader.
This sounds reasonable to me.
Reviewed-by: S
Reviewed-by: Samuel Pitoiset
On 08/19/2016 06:45 AM, Ilia Mirkin wrote:
In some very specially-crafted cases, we could attempt to visit a node
that has already been visited, and then run out of bb's to visit, while
there were still cross blocks on the list. Make sure that those get
moved
GM107.
Signed-off-by: Samuel Pitoiset
CC:
---
src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 20
src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 14 ++
2 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0
This fixes a divergent error found with F1 2015.
GM107 emitter already sets that bit.
Signed-off-by: Samuel Pitoiset
Cc:
---
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 5 +
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 5 +
2 files changed, 2
On 08/26/2016 01:58 PM, Kai Wasserbäch wrote:
Cc: Brian Paul
Signed-off-by: Kai Wasserbäch
---
Hi Brian,
is this what you had in mind? If so, I was wondering whether virgl_encode.c
would need to be updated as well. Doesn't seem like it, since the functions
there map everything to uint32_t or
On 08/26/2016 04:17 PM, Kai Wasserbäch wrote:
Hey Samuel,
Samuel Pitoiset wrote on 26.08.2016 15:54:
On 08/26/2016 01:58 PM, Kai Wasserbäch wrote:
[...]
diff --git a/src/gallium/drivers/nouveau/nv30/nv30_texture.c
b/src/gallium/drivers/nouveau/nv30/nv30_texture.c
index 4f4f87e..dc1a476
at 12:41 PM, Samuel Pitoiset
wrote:
This fixes a divergent error found with F1 2015.
GM107 emitter already sets that bit.
Signed-off-by: Samuel Pitoiset
Cc:
---
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 5 +
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 5 +-
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16 +++-
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index f139f66..b5be71a 100644
While we are at it, make it static and change the return values
policy to be consistent.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 3 ---
2 files changed, 8 insertions(+), 11
Trivial.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index abf2d0f..9b0f4f5 100644
--- a/src
On 08/30/2016 04:53 PM, Ilia Mirkin wrote:
On Tue, Aug 30, 2016 at 10:45 AM, Samuel Pitoiset
wrote:
While we are at it, make it static and change the return values
policy to be consistent.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16
This has never been used because info->immd.bufSize is always 0,
and in any case this is experimental code that was never
completed.
This gets rid of some unused code in the program validation process.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_progra
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nv50/nv50_program.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.h
b/src/gallium/drivers/nouveau/nv50/nv50_program.h
index fc9ada4..009d41f 100644
--- a/src/gallium/drivers/nouveau
This function will be helpful for resizing the code segment
area when we need to evict all shaders.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1 +
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 47 +++--
src/gallium/drivers
To keep the bins list from growing indefinitely when the code segment
size is bumped, we need to separate that bin from the SCREEN
one because it contains other resources like the uniform bo.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_context.c | 4 ++--
src
. The maximum size is arbitrarily
fixed to 8MB which should be enough.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 25 -
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
b
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index 6c6d177..0627f3d 100644
--- a/src/gallium
This refactoring will help fix the "out of code space"
eviction issue, because we will need to re-upload the code for
all currently bound shaders, which is slightly different from
uploading fresh code.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0
be re-uploaded and SP_START_ID
have to be updated accordingly.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 27 +
1 file changed, 27 insertions(+)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
b/src/gallium/drivers
On 08/31/2016 11:31 PM, Ilia Mirkin wrote:
On Wed, Aug 31, 2016 at 4:52 PM, Samuel Pitoiset
wrote:
This function will be helpful for resizing the code segment
area when we need to evict all shaders.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1
On 08/31/2016 11:36 PM, Ilia Mirkin wrote:
On Wed, Aug 31, 2016 at 4:52 PM, Samuel Pitoiset
wrote:
This function will be helpful for resizing the code segment
area when we need to evict all shaders.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1
On 09/01/2016 06:22 PM, Ilia Mirkin wrote:
On Thu, Sep 1, 2016 at 12:14 PM, Samuel Pitoiset
wrote:
On 08/31/2016 11:36 PM, Ilia Mirkin wrote:
On Wed, Aug 31, 2016 at 4:52 PM, Samuel Pitoiset
wrote:
This function will be helpful for resizing the code segment
area when we need to evict
This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast.h | 5 +
src/compiler/glsl/ast_type.cpp | 6 ++
src/compiler/glsl
shaders tests.
Marek, Nicolai and other AMD folks, I don't know if radeonsi will need a fix
somewhere for handling a variable work group size, but as I don't have the
hardware, I can't test. Let me know if something needs to be slightly updated.
Please review,
Thanks!
Samuel Pitoiset (11
Compute shaders can now include a fixed local size as defined by
ARB_compute_shader or a variable size as defined by
ARB_compute_variable_group_size.
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/linker.cpp | 23 +--
1 file changed, 21 insertions(+), 2 deletions
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE
which represents the block size in threads.
Signed-off-by: Samuel Pitoiset
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
b/src
This extension is only exposed if the underlying driver supports
ARB_compute_shader.
Signed-off-by: Samuel Pitoiset
---
src/mesa/state_tracker/st_extensions.c | 13 +
1 file changed, 13 insertions(+)
diff --git a/src/mesa/state_tracker/st_extensions.c
b/src/mesa/state_tracker
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/builtin_variables.cpp | 2 ++
src/compiler/shader_enums.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/src/compiler/glsl/builtin_variables.cpp
b/src/compiler/glsl/builtin_variables.cpp
index f47daab..a1768fc 100644
--- a
Signed-off-by: Samuel Pitoiset
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++
src/mapi/glapi/gen/Makefile.am | 1 +
src/mapi/glapi/gen/gl_API.xml | 2 ++
src/mesa/main/compute.c| 8
The ARB_compute_variable_group_size specification explains that
when a compute shader includes both a fixed and a variable local
size, a compile-time error occurs.
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast_to_hir.cpp | 14 ++
1 file changed, 14 insertions(+)
diff
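The rule quoted in the snippet above (a shader declaring both a fixed and a variable local size is a compile-time error) might be checked along these lines. This is a minimal sketch, not the code from the patch: `struct cs_layout`, `cs_layout_error()` and the error string are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the input layout qualifiers seen in one
 * compute shader. */
struct cs_layout {
    bool has_fixed_local_size;    /* layout(local_size_x = N, ...) in; */
    bool has_variable_local_size; /* layout(local_size_variable) in;   */
};

/* Return an error message if the two kinds of local size are mixed,
 * or NULL when the layout is acceptable. */
static const char *cs_layout_error(const struct cs_layout *layout)
{
    if (layout->has_fixed_local_size && layout->has_variable_local_size)
        return "compute shader can't include both a variable and a "
               "fixed local group size";
    return NULL;
}
```

A shader with only one of the two qualifiers passes the check; whether a shader with neither is accepted is decided elsewhere and is left out of this sketch.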
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/api_validate.c | 94
src/mesa/main/api_validate.h | 4 ++
src/mesa/main/compute.c | 17
src/mesa/main/context.c | 6 +++
src/mesa/main/dd.h | 9
src
When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0
and a SIGFPE occurs when we compute the maximum number of regs.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++-
1 file changed
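The SIGFPE described in the snippet above comes from dividing by a thread count of zero. One way to guard against it is sketched below, assuming a hypothetical helper `max_regs_per_thread()` and illustrative register-file numbers. Neither is taken from nv50_ir_target.h, and the fallback to the maximum block size is an assumption about a reasonable pessimistic choice, not a statement of what the patch does.

```c
#include <assert.h>

/* Hypothetical helper: how many registers each thread may use.
 * local_size is {0, 0, 0} when the shader uses a variable group size,
 * so fall back to the largest block the hardware allows rather than
 * dividing by zero. */
static unsigned max_regs_per_thread(unsigned regs_available,
                                    const unsigned local_size[3],
                                    unsigned max_threads_per_block)
{
    unsigned threads = local_size[0] * local_size[1] * local_size[2];

    if (threads == 0)
        threads = max_threads_per_block; /* pessimistic upper bound */

    assert(threads != 0);
    return regs_available / threads;
}
```

With the illustrative values 32768 registers and a maximum of 1024 threads per block, a fixed 8x8x1 block gets 512 registers per thread, while a variable-size shader is budgeted for the worst case of 32.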
This also initializes the default values for the standalone compiler.
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/glsl_parser_extras.cpp | 1 +
src/compiler/glsl/glsl_parser_extras.h | 2 ++
src/compiler/glsl/standalone.cpp | 4
src/compiler/glsl
Signed-off-by: Samuel Pitoiset
---
src/mesa/state_tracker/st_cb_compute.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/mesa/state_tracker/st_cb_compute.c
b/src/mesa/state_tracker/st_cb_compute.c
index 88c1ee2..ccc5dc2 100644
--- a/src/mesa
On 09/08/2016 10:58 PM, Ian Romanick wrote:
On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
Signed-off-by: Samuel Pitoiset
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++
src/mapi/glapi/gen/Makefile.am | 1 +
src/mapi/glapi/gen
On 09/08/2016 10:58 PM, Ian Romanick wrote:
On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/api_validate.c | 94
src/mesa/main/api_validate.h | 4 ++
src/mesa/main/compute.c | 17
ns and the
extension can be exposed based on that CAP.
Fine by me, I will add this new cap.
Thanks for reviewing.
Marek
On Thu, Sep 8, 2016 at 10:31 PM, Samuel Pitoiset
wrote:
This extension is only exposed if the underlying driver supports
ARB_compute_shader.
Signed-off-by: Samuel Pitoiset
---
Also, the nouveau enablement patch should come _before_ the patch that
turns on the extension...
Good catch, thanks. :)
Cheers,
Nicolai
On 08.09.2016 22:31, Samuel Pitoiset wrote:
Hi,
This series implements ARB_compute_variable_group_size written against
GL 4.3.
This extension allows to dis
On 09/09/2016 10:12 AM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
Not sure if it's possible to avoid programming the block size twice (once for
the userdata and once for the dispatch).
Since the shaders are compiled with a pessimistic upper limit on the number of
registers, asynchronously c
On 09/09/2016 02:37 PM, Ilia Mirkin wrote:
On Fri, Sep 9, 2016 at 8:29 AM, Marek Olšák wrote:
On Fri, Sep 9, 2016 at 10:12 AM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
Not sure if it's possible to avoid programming the block size twice (once for
the userdata and once for the dispatch).
On 09/09/2016 06:31 PM, Marek Olšák wrote:
On Fri, Sep 9, 2016 at 5:46 PM, Samuel Pitoiset
wrote:
On 09/08/2016 10:58 PM, Ian Romanick wrote:
On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
Signed-off-by: Samuel Pitoiset
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25
On 09/09/2016 08:22 PM, Ian Romanick wrote:
On 09/09/2016 08:46 AM, Samuel Pitoiset wrote:
On 09/08/2016 10:58 PM, Ian Romanick wrote:
On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
Signed-off-by: Samuel Pitoiset
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25
On 09/09/2016 08:46 PM, Ian Romanick wrote:
On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
The ARB_compute_variable_group_size specification explains that
when a compute shader includes both a fixed and a variable local
size, a compile-time error occurs.
I probably would have squashed this
On 09/09/2016 08:50 PM, Ilia Mirkin wrote:
On Thu, Sep 8, 2016 at 4:31 PM, Samuel Pitoiset
wrote:
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/builtin_variables.cpp | 2 ++
src/compiler/shader_enums.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/src/compiler
Cc: Nicolai Hähnle
Signed-off-by: Samuel Pitoiset
---
src/gallium/docs/source/tgsi.rst | 26 ++
1 file changed, 26 insertions(+)
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index c6e5ceb..d83cf9a 100644
--- a/src/gallium/docs/source
On 09/10/2016 05:01 PM, Ilia Mirkin wrote:
On Sat, Sep 10, 2016 at 10:05 AM, Samuel Pitoiset
wrote:
Cc: Nicolai Hähnle
Signed-off-by: Samuel Pitoiset
---
src/gallium/docs/source/tgsi.rst | 26 ++
1 file changed, 26 insertions(+)
diff --git a/src/gallium/docs
Cc: Nicolai Hähnle
Signed-off-by: Samuel Pitoiset
---
src/gallium/docs/source/tgsi.rst | 28
1 file changed, 28 insertions(+)
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index c6e5ceb..881aef6 100644
--- a/src/gallium/docs
This series is:
Reviewed-by: Samuel Pitoiset
On 09/10/2016 06:58 PM, Ilia Mirkin wrote:
We have to force it to write to predicate 7 (aka PT) in order for it not
to mess up another predicate. Unclear what would be returned in the
predicate, perhaps an error code for out-of-bounds requests
On 09/08/2016 10:58 PM, Ian Romanick wrote:
On 09/08/2016 01:31 PM, Samuel Pitoiset wrote:
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/api_validate.c | 94
src/mesa/main/api_validate.h | 4 ++
src/mesa/main/compute.c | 17
This also initializes the default values for the standalone compiler.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/glsl_parser_extras.cpp | 1 +
src/compiler/glsl/glsl_parser_extras.h | 2 ++
src/compiler/glsl/standalone.cpp | 4
src
Compute shaders can now include a fixed local size as defined by
ARB_compute_shader or a variable size as defined by
ARB_compute_variable_group_size.
v2: - update formatting spec quotations (Ian)
- various cosmetic changes (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
v2: - only add it if the ext is enabled (Ilia)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/builtin_variables.cpp | 6 ++
src/compiler/shader_enums.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/src/compiler/glsl/builtin_variables.cpp
This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/ast.h | 5 +
src/compiler/glsl/ast_type.cpp | 6
v2: - correctly sort that new extension (Ian)
- fix up the comment (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++
src/mapi/glapi/gen/Makefile.am | 1 +
src/mapi/glapi
eonsi will need a fix
somewhere for handling a variable work group size, but as I don't have the
hardware, I can't test. Let me know if something needs to be slightly updated.
Please review,
Thanks!
Samuel Pitoiset (14):
glapi: add entry points for GL_ARB_compute_variable_group_size
The ARB_compute_variable_group_size specification explains that
when a compute shader includes both a fixed and a variable local
size, a compile-time error occurs.
v2: - update formatting spec quotations (Ian)
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast_to_hir.cpp | 14
This extension is only exposed if the underlying driver supports
ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK
is set.
v2: - expose the ext based on that new cap
Signed-off-by: Samuel Pitoiset
---
src/mesa/state_tracker/st_extensions.c | 22 ++
1
v2: - update formatting spec quotations (Ian)
- move the total_invocations check outside of the loop (Ian)
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/api_validate.c | 96
src/mesa/main/api_validate.h | 4 ++
src/mesa/main/compute.c
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE
which represents the block size in threads.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/state_tracker
Let's return the same number of threads per block for both fixed and
variable sizes.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
Signed-off-by: Samuel Pitoiset
---
src/gallium/docs/source/screen.rst | 4
src/gallium/drivers/ilo/ilo_screen.c | 2 ++
src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 ++
src/gallium/drivers/radeon
Signed-off-by: Samuel Pitoiset
---
docs/features.txt | 2 +-
docs/relnotes/12.1.0.html | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/docs/features.txt b/docs/features.txt
index 690c160..3825943 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -279,7 +279,7
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
---
src/mesa/state_tracker/st_cb_compute.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/mesa/state_tracker/st_cb_compute.c
b/src/mesa/state_tracker/st_cb_compute.c
index 88c1ee2..ccc5dc2 100644
When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0
and a SIGFPE occurs when we compute the maximum number of regs.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++-
1 file changed
On 09/12/2016 05:35 PM, Nicolai Hähnle wrote:
On 11.09.2016 20:45, Samuel Pitoiset wrote:
Signed-off-by: Samuel Pitoiset
---
src/gallium/docs/source/screen.rst | 4
src/gallium/drivers/ilo/ilo_screen.c | 2 ++
src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2