[Mesa-dev] [PATCH v4 07/14] glsl: add gl_LocalGroupSizeARB as a system value

2016-10-05 Thread Samuel Pitoiset
v2: - only add it if the ext is enabled (Ilia) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick Reviewed-by: Nicolai Hähnle --- src/compiler/glsl/builtin_variables.cpp | 6 ++ src/compiler/shader_enums.h | 1 + 2 files changed, 7 insertions(+) diff --git a/src

[Mesa-dev] [PATCH v4 08/14] gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK

2016-10-05 Thread Samuel Pitoiset
v3: - use a new case statement in r600_pipe_common.c - fix compilation of softpipe... Signed-off-by: Samuel Pitoiset Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/docs/source/screen.rst | 4 src/gallium/drivers/ilo/ilo_screen.c | 2 ++ src

[Mesa-dev] [PATCH v4 11/14] st/mesa: expose ARB_compute_variable_group_size

2016-10-05 Thread Samuel Pitoiset
This extension is only exposed if the underlying driver supports ARB_compute_shader and if PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK is set. v3: - initialize max_variable_threads_per_block to 0 v2: - expose the ext based on that new cap Signed-off-by: Samuel Pitoiset Reviewed-by: Marek Olšák
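
The exposure rule described in this message can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name and its parameters are hypothetical and are not taken from st_extensions.c; the only facts used are that the extension requires compute shader support and a non-zero value for the new cap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: decides whether
 * ARB_compute_variable_group_size may be advertised.
 * max_variable_threads_per_block stands for the value a driver
 * reports for the new cap; 0 means "not supported". */
static bool
can_expose_variable_group_size(bool has_compute_shader,
                               uint64_t max_variable_threads_per_block)
{
   return has_compute_shader && max_variable_threads_per_block > 0;
}
```

Initializing the reported value to 0, as the v3 note says, makes "cap not answered by the driver" and "extension not exposed" the same case.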

Re: [Mesa-dev] [PATCH v4 13/14] nvc0: expose ARB_compute_variable_group_size

2016-10-05 Thread Samuel Pitoiset
On 10/05/2016 08:57 PM, Ilia Mirkin wrote: On Wed, Oct 5, 2016 at 2:48 PM, Samuel Pitoiset wrote: Only expose 512 threads/block on Fermi to not be limited by 32 GPRs/thread. v4: - use 512 threads on Fermi, 2014 on Kepler+ Dyslexics... untie! Ahah! :) Typo... Signed-off-by: Samuel

Re: [Mesa-dev] [PATCH v4 02/14] mesa/main: add support for ARB_compute_variable_groups_size

2016-10-06 Thread Samuel Pitoiset
On 10/06/2016 09:25 AM, Nicolai Hähnle wrote: On 05.10.2016 20:48, Samuel Pitoiset wrote: v4: - slightly indent spec quotes (Nicolai) - drop useless _mesa_has_compute_shaders() check (Nicolai) - move the fixed local size outside of the loop (Nicolai) - add missing check for

Re: [Mesa-dev] [PATCH v4 04/14] glsl: process local_size_variable input qualifier

2016-10-06 Thread Samuel Pitoiset
On 10/06/2016 09:27 AM, Nicolai Hähnle wrote: On 05.10.2016 20:48, Samuel Pitoiset wrote: This is the new layout qualifier introduced by ARB_compute_variable_group_size which allows the use of a variable work group size. v4: - add missing '%s' in the monster format string Signed-off-

[Mesa-dev] [PATCH] nv50/ir: fix wrong check when optimizing MAD to SHLADD

2016-10-06 Thread Samuel Pitoiset
Checking if MAD is supported is definitely wrong; it is most likely a typo I introduced a few days ago, which breaks NV50 because SHLADD is not supported there. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +- 1 file changed, 1 insertion(

Re: [Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-07 Thread Samuel Pitoiset
I will run piglit with that patch before pushing. Reviewed-by: Samuel Pitoiset On 10/06/2016 11:33 PM, Karol Herbst wrote: total instructions in shared programs : 2818606 -> 2818227 (-0.01%) total gprs used in shared programs: 379273 -> 379238 (-0.01%) total local used in shared pr

Re: [Mesa-dev] [PATCH 2/6] nv50/ir: add LIMM form of mad to gm107

2016-10-08 Thread Samuel Pitoiset
Usually we prefix with gm107/ir, gk110/ir, etc... More comments below. On 10/08/2016 05:43 PM, Karol Herbst wrote: Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 -- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/src

Re: [Mesa-dev] [PATCH 4/6] nv50/ir: rework postraconstantfolding pass

2016-10-08 Thread Samuel Pitoiset
"rework" is not the right term in my opinion. :) On 10/08/2016 05:43 PM, Karol Herbst wrote: we might want to add more folding passes here, so make it a bit more generic Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 124 ++--- 1 file chan

Re: [Mesa-dev] [PATCH 5/6] nv50/ra: always prefer def == src2 for mad/sad

2016-10-08 Thread Samuel Pitoiset
On 10/08/2016 05:43 PM, Karol Herbst wrote: just little random noise in shader-db Like what? Please elaborate. will help in the next patch Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --g

Re: [Mesa-dev] [PATCH 3/6] nv50/ir: replace post_ra_dead by Instruction::isDead

2016-10-08 Thread Samuel Pitoiset
On 10/08/2016 05:43 PM, Karol Herbst wrote: Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir.h| 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20 +++- 2 files changed, 8 insertions(+), 14 deletions(-) diff --git a/src/gallium

Re: [Mesa-dev] [PATCH 1/6] nv50/ir: add LIMM form of mad to gk110

2016-10-08 Thread Samuel Pitoiset
Please, update the prefix. Also the same comment applies here, and I think the best way is to enable that PostRAConstantFoldingPass for nvc0+ in a separate patch at the end of that series. That way you won't break things and mupuf will appreciate. :) On 10/08/2016 05:43 PM, Karol Herbst wrot

Re: [Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-08 Thread Samuel Pitoiset
This breaks a bunch of things, like: spec/glsl-4.30/execution/built-in-functions/cs-all-bvec2-using-if: fail spec/glsl-4.30/execution/built-in-functions/cs-all-bvec3-using-if: fail spec/glsl-4.30/execution/built-in-functions/cs-all-bvec4-using-if: fail spec/glsl-4.30/execution/built-in-functions/

Re: [Mesa-dev] [PATCH 1/6] nv50/ir: add LIMM form of mad to gk110

2016-10-08 Thread Samuel Pitoiset
On 10/08/2016 07:59 PM, Karol Herbst wrote: 2016-10-08 18:54 GMT+02:00 Samuel Pitoiset : Please, update the prefix. Also the same comment applies here, and I think the best way is to enable that PostRAConstantFoldingPass for nvc0+ in a separate patch at the end of that series. That way you

Re: [Mesa-dev] [PATCH] gf100/ir: limms on gm107 are 19 bit

2016-10-08 Thread Samuel Pitoiset
On 10/08/2016 09:26 PM, Ilia Mirkin wrote: Pretty sure that the float one is fine. And there's a 20th bit, it just behaves differently than one might expect. I don't remember all the details though... Yep, the float one is correct. The 20th bit is the sign bit, which is correctly emitted in

[Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-08 Thread Samuel Pitoiset
852 852 hurt 0 44 23 23 Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 94 ++ 1 file changed, 94 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.c
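
The rewrite named in the subject line relies on a simple arithmetic identity, sketched below in C. The two functions are hypothetical stand-ins written for illustration; they are not the nv50_ir peephole code, and they assume 32-bit unsigned operands with a shift amount below 32.

```c
#include <stdint.h>

/* Unfused form: a separate shift followed by an add (two operations). */
static uint32_t
add_of_shl(uint32_t a, uint32_t shift, uint32_t c)
{
   uint32_t t = a << shift; /* SHL(a, b) */
   return t + c;            /* ADD(t, c) */
}

/* Fused form: what a single SHLADD(a, b, c) is assumed to compute. */
static uint32_t
shladd(uint32_t a, uint32_t shift, uint32_t c)
{
   return (a << shift) + c;
}
```

Both forms produce the same value for every input, including wraparound cases, so the fusion only changes the instruction count.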

[Mesa-dev] [PATCH 2/2] nvc0: enable GLSL 4.5

2016-10-09 Thread Samuel Pitoiset
This exposes OpenGL 4.5 on Fermi and Kepler GPUs. Maxwell still only exposes OpenGL 4.1 because I need to finish my instructions scheduler calculator. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

[Mesa-dev] [PATCH 1/2] nvc0: enable ARB_enhanced_layouts

2016-10-09 Thread Samuel Pitoiset
All ARB_enhanced_layouts piglit tests pass without any changes in our compiler. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium

Re: [Mesa-dev] [PATCH] nv50/ir: only stick one preret per function

2016-10-09 Thread Samuel Pitoiset
This looks good to me, but fyi this doesn't fix the regressions introduced with "nv50/ir: start LocalCSE with getFirst to merge PHI instructions". Something else is probably wrong. Reviewed-by: Samuel Pitoiset overflowed the call stack, in case a function had a lot of early ret

Re: [Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-09 Thread Samuel Pitoiset
On 10/08/2016 10:04 PM, Karol Herbst wrote: looks great, a few comments below Thanks! 2016-10-08 21:55 GMT+02:00 Samuel Pitoiset : total instructions in shared programs :2286901 -> 2284473 (-0.11%) total gprs used in shared programs:335256 -> 335273 (0.01%) total local used in

Re: [Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-09 Thread Samuel Pitoiset
On 10/08/2016 10:09 PM, Ilia Mirkin wrote: On Sat, Oct 8, 2016 at 3:55 PM, Samuel Pitoiset wrote: total instructions in shared programs :2286901 -> 2284473 (-0.11%) total gprs used in shared programs:335256 -> 335273 (0.01%) total local used in shared programs :31968 -> 31

[Mesa-dev] [PATCH] ddebug: add missing pipe_context::clear_texture()

2016-10-09 Thread Samuel Pitoiset
This fixes a crash while replaying a trace from F1 2015. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/ddebug/dd_context.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index edcbf2c

Re: [Mesa-dev] [PATCH] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-09 Thread Samuel Pitoiset
On 10/09/2016 09:28 PM, Karol Herbst wrote: 2016-10-09 13:58 GMT+02:00 Samuel Pitoiset : On 10/08/2016 10:04 PM, Karol Herbst wrote: looks great, a few comments below Thanks! 2016-10-08 21:55 GMT+02:00 Samuel Pitoiset : total instructions in shared programs :2286901 -> 2284

[Mesa-dev] [PATCH] nvc0: fix valid range for shader buffers

2016-10-09 Thread Samuel Pitoiset
When offset != 0, the valid range was wrong because the second argument of util_range_add() is end, not size. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c| 1 + src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 1 + src/gallium/drivers
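
The end-versus-size mix-up described here can be sketched with a small stand-in. The struct and functions below are hypothetical and are not Mesa's util_range implementation; the only behaviour taken from the message is that the range helper expects an end position rather than a size.

```c
#include <stdint.h>

/* Hypothetical stand-in for a valid-range tracker. */
struct valid_range {
   uint32_t start; /* inclusive */
   uint32_t end;   /* exclusive */
};

static void
valid_range_init(struct valid_range *r)
{
   r->start = UINT32_MAX; /* empty range */
   r->end = 0;
}

/* Grow the range to cover [start, end). Note: takes an end, not a size. */
static void
valid_range_add(struct valid_range *r, uint32_t start, uint32_t end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Mark `size` bytes bound at `offset` as valid. Passing `size` as the
 * last argument would only be correct when offset == 0. */
static void
mark_binding_valid(struct valid_range *r, uint32_t offset, uint32_t size)
{
   valid_range_add(r, offset, offset + size);
}
```

With offset 256 and size 64, the buggy call would record an end of 64, which lies before the start; the corrected call records [256, 320).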

Re: [Mesa-dev] [PATCH] nvc0/ir: fix overwriting of value backing non-constant gather offset

2016-10-10 Thread Samuel Pitoiset
Confirmed. Reviewed-by: Samuel Pitoiset On 10/10/2016 06:12 PM, Ilia Mirkin wrote: Normally the value is an immediate, which is moved to some temporary, so there's no problem. In the case of a non-constant offset (as allowed by ARB_gpu_shader5), we have to take care to copy it first b

[Mesa-dev] [PATCH v2] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-11 Thread Samuel Pitoiset
852 852 hurt 0 44 23 23 v2: - use visit(Instruction *) - use getUniqueInsn() - use getImmediate() - fix mod for src0 Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 87 +

Re: [Mesa-dev] [PATCH v2] nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

2016-10-11 Thread Samuel Pitoiset
On 10/11/2016 11:17 PM, Ilia Mirkin wrote: On Tue, Oct 11, 2016 at 5:01 PM, Samuel Pitoiset wrote: total instructions in shared programs :2286901 -> 2284473 (-0.11%) total gprs used in shared programs:335256 -> 335273 (0.01%) total local used in shared programs :31968 -> 31

Re: [Mesa-dev] [PATCH] nvc0/ir: fix overwriting of value backing non-constant gather offset

2016-10-12 Thread Samuel Pitoiset
Unfortunately, this introduces some regressions: spec/arb_gpu_shader5/texturegatheroffset/fs-r-none-shadow-2d: fail spec/arb_gpu_shader5/texturegatheroffset/fs-r-none-shadow-2darray: fail spec/arb_gpu_shader5/texturegatheroffset/fs-r-none-shadow-2drect: fail spec/arb_gpu_shader5/texturegatheroffs

Re: [Mesa-dev] [PATCH] nv50/ir: copy over value's register id when resolving merge of a phi

2016-10-12 Thread Samuel Pitoiset
Sounds reasonable. Reviewed-by: Samuel Pitoiset On 10/12/2016 02:51 AM, Ilia Mirkin wrote: The offset needs to be properly copied over to the phi value, otherwise it will get assigned to the base of the merge instead of the proper location. Signed-off-by: Ilia Mirkin Cc: mesa-sta

Re: [Mesa-dev] [PATCH] nvc0/ir: fix textureGather with a single offset

2016-10-12 Thread Samuel Pitoiset
You fixed the recent regressions, thanks! Reviewed-by: Samuel Pitoiset On 10/12/2016 04:26 PM, Ilia Mirkin wrote: Recent fix for non-const offsets broke the case of a single offset (vs 4 offsets). The later code relies on the offs array to contain null values to tell whether they should be

Re: [Mesa-dev] [PATCH] nvc0/ir: be more careful about preserving modifiers in SHLADD creation

2016-10-12 Thread Samuel Pitoiset
I think we could also use those copy modifiers in some other places. Reviewed-by: Samuel Pitoiset On 10/12/2016 07:32 PM, Ilia Mirkin wrote: First off, src2 was being given the wrong modifier, and secondly we were forgetting to clear src0's modifier. Instead let's use the Valu

Re: [Mesa-dev] [PATCH v2] doc/features.txt: factor out nvc0/radeonsi as GL45 complete

2016-10-13 Thread Samuel Pitoiset
On 10/13/2016 10:31 AM, Andreas Boll wrote: nak, neither radeonsi nor i965 advertise GLSL 4.50. Nicolai hasn't pushed the patch to enable GLSL 4.50 [1]. I'm not sure what the plan for nouveau is [2]. The plan is to not change docs/features.txt for nvc0 (not yet). :) See also https://patc

Re: [Mesa-dev] [PATCH v2] nvc0/ir: be more careful about preserving modifiers in SHLADD creation

2016-10-13 Thread Samuel Pitoiset
t's just for safety, but if the compiler adds modifiers to SHL something is really wrong... Looks good now. Reviewed-by: Samuel Pitoiset return false; if (!shl->src(1).getImmediate(imm)) return false; - mod[0] = add->src(0).mod; - mod[1] = add->src(1)

Re: [Mesa-dev] [PATCH v2] nvc0/ir: be more careful about preserving modifiers in SHLADD creation

2016-10-13 Thread Samuel Pitoiset
On 10/13/2016 03:56 PM, Ilia Mirkin wrote: On Thu, Oct 13, 2016 at 9:53 AM, Samuel Pitoiset wrote: On 10/12/2016 08:42 PM, Ilia Mirkin wrote: src2 was being given the wrong modifier, and we were not properly managing the modifier on the SHL source either. Signed-off-by: Ilia Mirkin

Re: [Mesa-dev] [PATCH] nv50/ir: fix bb positions after exit instructions

2016-08-16 Thread Samuel Pitoiset
On 08/14/2016 04:22 AM, Ilia Mirkin wrote: It's fairly rare that the BB layout puts BBs after the exit block, which is likely the reason these issues lingered for so long. This fixes a fraction of issues with the giant pixmark piano shader. This sounds reasonable to me. Reviewed-by: S

Re: [Mesa-dev] [PATCH] nv50/ir: make sure cfg iterator always hits all blocks

2016-08-22 Thread Samuel Pitoiset
Reviewed-by: Samuel Pitoiset On 08/19/2016 06:45 AM, Ilia Mirkin wrote: In some very specially-crafted cases, we could attempt to visit a node that has already been visited, and then run out of bb's to visit, while there were still cross blocks on the list. Make sure that those get moved

[Mesa-dev] [PATCH] nvc0: invalidate textures/samplers on GK104+

2016-08-24 Thread Samuel Pitoiset
GM107. Signed-off-by: Samuel Pitoiset CC: --- src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 20 src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 14 ++ 2 files changed, 22 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0

[Mesa-dev] [PATCH] nv50/ir: always emit the NDV bit for OP_QUADOP

2016-08-25 Thread Samuel Pitoiset
This fixes a divergent error found with F1 2015. GM107 emitter already sets that bit. Signed-off-by: Samuel Pitoiset Cc: --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 5 + src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 5 + 2 files changed, 2

Re: [Mesa-dev] [PATCH] gallium: Use enum pipe_shader_type in bind_sampler_states()

2016-08-26 Thread Samuel Pitoiset
On 08/26/2016 01:58 PM, Kai Wasserbäch wrote: Cc: Brian Paul Signed-off-by: Kai Wasserbäch --- Hi Brian, is this what you had in mind? If so, I was wondering whether virgl_encode.c would need to be updated as well. Doesn't seem like it, since the functions there map everything to uint32_t or

Re: [Mesa-dev] [PATCH] gallium: Use enum pipe_shader_type in bind_sampler_states()

2016-08-26 Thread Samuel Pitoiset
On 08/26/2016 04:17 PM, Kai Wasserbäch wrote: Hey Samuel, Samuel Pitoiset wrote on 26.08.2016 15:54: On 08/26/2016 01:58 PM, Kai Wasserbäch wrote: [...] diff --git a/src/gallium/drivers/nouveau/nv30/nv30_texture.c b/src/gallium/drivers/nouveau/nv30/nv30_texture.c index 4f4f87e..dc1a476

Re: [Mesa-dev] [PATCH] nv50/ir: always emit the NDV bit for OP_QUADOP

2016-08-26 Thread Samuel Pitoiset
at 12:41 PM, Samuel Pitoiset wrote: This fixes a divergent error found with F1 2015. GM107 emitter already sets that bit. Signed-off-by: Samuel Pitoiset Cc: --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 5 + src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 5 +-

[Mesa-dev] [PATCH 1/3] nvc0: make use of FAIL_SCREEN_INIT in nvc0_screen_create()

2016-08-30 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index f139f66..b5be71a 100644

[Mesa-dev] [PATCH 2/3] nvc0: check return value of nvc0_screen_resize_tls_area()

2016-08-30 Thread Samuel Pitoiset
While we are at it, make it static and change the return values policy to be consistent. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16 src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 3 --- 2 files changed, 8 insertions(+), 11

[Mesa-dev] [PATCH 3/3] nvc0: fix indentation in nvc0_screen_init()

2016-08-30 Thread Samuel Pitoiset
Trivial. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index abf2d0f..9b0f4f5 100644 --- a/src

Re: [Mesa-dev] [PATCH 2/3] nvc0: check return value of nvc0_screen_resize_tls_area()

2016-08-30 Thread Samuel Pitoiset
On 08/30/2016 04:53 PM, Ilia Mirkin wrote: On Tue, Aug 30, 2016 at 10:45 AM, Samuel Pitoiset wrote: While we are at it, make it static and change the return values policy to be consistent. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 16

[Mesa-dev] [PATCH 2/2] nvc0: remove an attempt at uploading all IMMD into a CB

2016-08-31 Thread Samuel Pitoiset
This has never been used because info->immd.bufSize is always 0, and in any case this is experimental code that was never completed. This gets rid of some unused code in the program validation process. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_progra

[Mesa-dev] [PATCH 1/2] nv50: remove unused nv50_program::immd_size field

2016-08-31 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nv50/nv50_program.h | 1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.h b/src/gallium/drivers/nouveau/nv50/nv50_program.h index fc9ada4..009d41f 100644 --- a/src/gallium/drivers/nouveau

[Mesa-dev] [PATCH 3/6] nvc0: add nvc0_screen_resize_text_area() helper

2016-08-31 Thread Samuel Pitoiset
This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 47 +++-- src/gallium/drivers

[Mesa-dev] [PATCH 4/6] nvc0: add a new bin for the code segment

2016-08-31 Thread Samuel Pitoiset
To keep the bins list from growing indefinitely when the code segment size is bumped, we need to separate that bin from the SCREEN one because it contains other resources like the uniform bo. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_context.c | 4 ++-- src

[Mesa-dev] [PATCH 5/6] nvc0: allow to resize the code segment dynamically

2016-08-31 Thread Samuel Pitoiset
. The maximum size is arbitrarily fixed at 8MB, which should be enough. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 25 - 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b

[Mesa-dev] [PATCH 6/6] nvc0: reduce the initial code segment size to 512KB

2016-08-31 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index 6c6d177..0627f3d 100644 --- a/src/gallium

[Mesa-dev] [PATCH 1/6] nvc0: refactor the program upload process

2016-08-31 Thread Samuel Pitoiset
This refactoring will help fix the "out of code space" eviction issue, because we will need to re-upload the code for all currently bound shaders, which is slightly different from uploading fresh code. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0

[Mesa-dev] [PATCH 2/6] nvc0: re-upload currently bound shaders after code eviction

2016-08-31 Thread Samuel Pitoiset
be re-uploaded and SP_START_ID have to be updated accordingly. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 27 + 1 file changed, 27 insertions(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c b/src/gallium/drivers

Re: [Mesa-dev] [PATCH 3/6] nvc0: add nvc0_screen_resize_text_area() helper

2016-08-31 Thread Samuel Pitoiset
On 08/31/2016 11:31 PM, Ilia Mirkin wrote: On Wed, Aug 31, 2016 at 4:52 PM, Samuel Pitoiset wrote: This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1

Re: [Mesa-dev] [PATCH 3/6] nvc0: add nvc0_screen_resize_text_area() helper

2016-09-01 Thread Samuel Pitoiset
On 08/31/2016 11:36 PM, Ilia Mirkin wrote: On Wed, Aug 31, 2016 at 4:52 PM, Samuel Pitoiset wrote: This function will be helpful for resizing the code segment area when we need to evict all shaders. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 1

Re: [Mesa-dev] [PATCH 3/6] nvc0: add nvc0_screen_resize_text_area() helper

2016-09-01 Thread Samuel Pitoiset
On 09/01/2016 06:22 PM, Ilia Mirkin wrote: On Thu, Sep 1, 2016 at 12:14 PM, Samuel Pitoiset wrote: On 08/31/2016 11:36 PM, Ilia Mirkin wrote: On Wed, Aug 31, 2016 at 4:52 PM, Samuel Pitoiset wrote: This function will be helpful for resizing the code segment area when we need to evict

[Mesa-dev] [PATCH 04/11] glsl: process local_size_variable input qualifier

2016-09-08 Thread Samuel Pitoiset
This is the new layout qualifier introduced by ARB_compute_variable_group_size which allows the use of a variable work group size. Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/ast.h | 5 + src/compiler/glsl/ast_type.cpp | 6 ++ src/compiler/glsl

[Mesa-dev] [PATCH 00/11] add support for ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset
shaders tests. Marek, Nicolai and other AMD folks, I don't know if radeonsi will need a fix somewhere for handling a variable work group size, but as I don't have the hardware, I can't test. Let me know if something needs to be slightly updated. Please review, Thanks! Samuel Pitoiset (11

[Mesa-dev] [PATCH 06/11] glsl/linker: handle errors when a variable local size is used

2016-09-08 Thread Samuel Pitoiset
Compute shaders can now include a fixed local size as defined by ARB_compute_shader or a variable size as defined by ARB_compute_variable_group_size. Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/linker.cpp | 23 +-- 1 file changed, 21 insertions(+), 2 deletions

[Mesa-dev] [PATCH 08/11] st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE

2016-09-08 Thread Samuel Pitoiset
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE which represents the block size in threads. Signed-off-by: Samuel Pitoiset --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src

[Mesa-dev] [PATCH 10/11] st/mesa: expose ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset
This extension is only exposed if the underlying driver supports ARB_compute_shader. Signed-off-by: Samuel Pitoiset --- src/mesa/state_tracker/st_extensions.c | 13 + 1 file changed, 13 insertions(+) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker

[Mesa-dev] [PATCH 07/11] glsl: add gl_LocalGroupSizeARB as a system value

2016-09-08 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/builtin_variables.cpp | 2 ++ src/compiler/shader_enums.h | 1 + 2 files changed, 3 insertions(+) diff --git a/src/compiler/glsl/builtin_variables.cpp b/src/compiler/glsl/builtin_variables.cpp index f47daab..a1768fc 100644 --- a

[Mesa-dev] [PATCH 01/11] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml | 2 ++ src/mesa/main/compute.c| 8

[Mesa-dev] [PATCH 05/11] glsl: reject compute shaders with fixed and variable local size

2016-09-08 Thread Samuel Pitoiset
The ARB_compute_variable_group_size specification explains that when a compute shader includes both a fixed and a variable local size, a compile-time error occurs. Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/ast_to_hir.cpp | 14 ++ 1 file changed, 14 insertions(+) diff
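
The rule quoted from the specification can be sketched as a single check. The types and names below are hypothetical and are not the ast_to_hir.cpp code; the sketch assumes the compiler has already recorded which kinds of local size declaration it saw in the shader.

```c
#include <stdbool.h>

/* Hypothetical summary of the layout declarations seen in one compute shader. */
struct cs_layout_info {
   bool has_fixed_local_size;    /* local_size_x/y/z was declared */
   bool has_variable_local_size; /* local_size_variable was declared */
};

enum cs_layout_status {
   CS_LAYOUT_OK,
   CS_LAYOUT_ERROR_FIXED_AND_VARIABLE, /* would be a compile-time error */
};

/* Reject a shader that declares both a fixed and a variable local size. */
static enum cs_layout_status
check_cs_layout(const struct cs_layout_info *info)
{
   if (info->has_fixed_local_size && info->has_variable_local_size)
      return CS_LAYOUT_ERROR_FIXED_AND_VARIABLE;
   return CS_LAYOUT_OK;
}
```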

[Mesa-dev] [PATCH 02/11] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-08 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/mesa/main/api_validate.c | 94 src/mesa/main/api_validate.h | 4 ++ src/mesa/main/compute.c | 17 src/mesa/main/context.c | 6 +++ src/mesa/main/dd.h | 9 src

[Mesa-dev] [PATCH 11/11] nv50/ir: use 1024 threads/block for variable local size

2016-09-08 Thread Samuel Pitoiset
When a variable local size is defined as specified by ARB_compute_variable_group_size, the fixed local size is set to 0 and a SIGFPE occurs when we compute the maximum number of regs. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++- 1 file changed
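
The crash described here is an integer division by a thread count of zero. A minimal sketch of the guard follows, under stated assumptions: the function names and the register file size passed by the caller are illustrative and are not taken from nv50_ir_target.h; only the 1024 threads/block fallback comes from the patch subject.

```c
#include <stdint.h>

/* Fallback named in the patch subject for shaders with a variable local size. */
#define FALLBACK_THREADS_PER_BLOCK 1024u

/* With a variable local size the fixed size is all zeros, so the product
 * is 0; substitute the fallback instead of dividing by zero later. */
static uint32_t
threads_per_block(const uint32_t local_size[3])
{
   uint32_t threads = local_size[0] * local_size[1] * local_size[2];
   return threads ? threads : FALLBACK_THREADS_PER_BLOCK;
}

/* Hypothetical register budget: register file size divided by threads. */
static uint32_t
max_regs_per_thread(uint32_t regfile_size, const uint32_t local_size[3])
{
   return regfile_size / threads_per_block(local_size);
}
```

Assuming the worst-case block size is a pessimistic choice: it can only under-estimate the registers available per thread, never over-estimate them.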

[Mesa-dev] [PATCH 03/11] glsl: add enable flags for ARB_compute_variable_group_size

2016-09-08 Thread Samuel Pitoiset
This also initializes the default values for the standalone compiler. Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/glsl_parser_extras.cpp | 1 + src/compiler/glsl/glsl_parser_extras.h | 2 ++ src/compiler/glsl/standalone.cpp | 4 src/compiler/glsl

[Mesa-dev] [PATCH 09/11] st/mesa: add support for dispatching a variable local size

2016-09-08 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/mesa/state_tracker/st_cb_compute.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_compute.c b/src/mesa/state_tracker/st_cb_compute.c index 88c1ee2..ccc5dc2 100644 --- a/src/mesa

Re: [Mesa-dev] [PATCH 01/11] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-09 Thread Samuel Pitoiset
On 09/08/2016 10:58 PM, Ian Romanick wrote: On 09/08/2016 01:31 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- .../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen

Re: [Mesa-dev] [PATCH 02/11] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-09 Thread Samuel Pitoiset
On 09/08/2016 10:58 PM, Ian Romanick wrote: On 09/08/2016 01:31 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/mesa/main/api_validate.c | 94 src/mesa/main/api_validate.h | 4 ++ src/mesa/main/compute.c | 17

Re: [Mesa-dev] [PATCH 10/11] st/mesa: expose ARB_compute_variable_group_size

2016-09-09 Thread Samuel Pitoiset
ns and the extension can be exposed based on that CAP. Fine by me, I will add this new cap. Thanks for reviewing. Marek On Thu, Sep 8, 2016 at 10:31 PM, Samuel Pitoiset wrote: This extension is only exposed if the underlying driver supports ARB_compute_shader. Signed-off-by: Samuel Pitoiset ---

Re: [Mesa-dev] [PATCH 00/11] add support for ARB_compute_variable_group_size

2016-09-09 Thread Samuel Pitoiset
Also, the nouveau enablement patch should come _before_ the patch that turns on the extension... Good catch, thanks. :) Cheers, Nicolai On 08.09.2016 22:31, Samuel Pitoiset wrote: Hi, This series implements ARB_compute_variable_group_size written against GL 4.3. This extension allows to dis

Re: [Mesa-dev] [PATCH] radeonsi: support TGSI compute shaders with variable block size

2016-09-09 Thread Samuel Pitoiset
On 09/09/2016 10:12 AM, Nicolai Hähnle wrote: From: Nicolai Hähnle Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch). Since the shaders are compiled with a pessimistic upper limit on the number of registers, asynchronously c

Re: [Mesa-dev] [PATCH] radeonsi: support TGSI compute shaders with variable block size

2016-09-09 Thread Samuel Pitoiset
On 09/09/2016 02:37 PM, Ilia Mirkin wrote: On Fri, Sep 9, 2016 at 8:29 AM, Marek Olšák wrote: On Fri, Sep 9, 2016 at 10:12 AM, Nicolai Hähnle wrote: From: Nicolai Hähnle Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch).

Re: [Mesa-dev] [PATCH 01/11] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-09 Thread Samuel Pitoiset
On 09/09/2016 06:31 PM, Marek Olšák wrote: On Fri, Sep 9, 2016 at 5:46 PM, Samuel Pitoiset wrote: On 09/08/2016 10:58 PM, Ian Romanick wrote: On 09/08/2016 01:31 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- .../glapi/gen/ARB_compute_variable_group_size.xml | 25

Re: [Mesa-dev] [PATCH 01/11] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-10 Thread Samuel Pitoiset
On 09/09/2016 08:22 PM, Ian Romanick wrote: On 09/09/2016 08:46 AM, Samuel Pitoiset wrote: On 09/08/2016 10:58 PM, Ian Romanick wrote: On 09/08/2016 01:31 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- .../glapi/gen/ARB_compute_variable_group_size.xml | 25

Re: [Mesa-dev] [PATCH 05/11] glsl: reject compute shaders with fixed and variable local size

2016-09-10 Thread Samuel Pitoiset
On 09/09/2016 08:46 PM, Ian Romanick wrote: On 09/08/2016 01:31 PM, Samuel Pitoiset wrote: The ARB_compute_variable_group_size specification explains that when a compute shader includes both a fixed and a variable local size, a compile-time error occurs. I probably would have squashed this

Re: [Mesa-dev] [PATCH 07/11] glsl: add gl_LocalGroupSizeARB as a system value

2016-09-10 Thread Samuel Pitoiset
On 09/09/2016 08:50 PM, Ilia Mirkin wrote: On Thu, Sep 8, 2016 at 4:31 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/builtin_variables.cpp | 2 ++ src/compiler/shader_enums.h | 1 + 2 files changed, 3 insertions(+) diff --git a/src/compiler

[Mesa-dev] [PATCH] tgsi: document semantics for compute shaders

2016-09-10 Thread Samuel Pitoiset
Cc: Nicolai Hähnle Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/tgsi.rst | 26 ++ 1 file changed, 26 insertions(+) diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index c6e5ceb..d83cf9a 100644 --- a/src/gallium/docs/source

Re: [Mesa-dev] [PATCH] tgsi: document semantics for compute shaders

2016-09-10 Thread Samuel Pitoiset
On 09/10/2016 05:01 PM, Ilia Mirkin wrote: On Sat, Sep 10, 2016 at 10:05 AM, Samuel Pitoiset wrote: Cc: Nicolai Hähnle Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/tgsi.rst | 26 ++ 1 file changed, 26 insertions(+) diff --git a/src/gallium/docs

[Mesa-dev] [PATCH v2] tgsi: document semantics for compute shaders

2016-09-10 Thread Samuel Pitoiset
Cc: Nicolai Hähnle Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/tgsi.rst | 28 1 file changed, 28 insertions(+) diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst index c6e5ceb..881aef6 100644 --- a/src/gallium/docs

Re: [Mesa-dev] [PATCH v2 1/2] gm107/ir: AL2P writes to a predicate register

2016-09-10 Thread Samuel Pitoiset
This series is: Reviewed-by: Samuel Pitoiset On 09/10/2016 06:58 PM, Ilia Mirkin wrote: We have to force it to write to predicate 7 (aka PT) in order for it not to mess up another predicate. Unclear what would be returned in the predicate, perhaps an error code for out-of-bounds requests

Re: [Mesa-dev] [PATCH 02/11] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-11 Thread Samuel Pitoiset
On 09/08/2016 10:58 PM, Ian Romanick wrote: On 09/08/2016 01:31 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/mesa/main/api_validate.c | 94 src/mesa/main/api_validate.h | 4 ++ src/mesa/main/compute.c | 17

[Mesa-dev] [PATCH v2 03/14] glsl: add enable flags for ARB_compute_variable_group_size

2016-09-11 Thread Samuel Pitoiset
This also initializes the default values for the standalone compiler. Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- src/compiler/glsl/glsl_parser_extras.cpp | 1 + src/compiler/glsl/glsl_parser_extras.h | 2 ++ src/compiler/glsl/standalone.cpp | 4 src

[Mesa-dev] [PATCH v2 06/14] glsl/linker: handle errors when a variable local size is used

2016-09-11 Thread Samuel Pitoiset
Compute shaders can now include a fixed local size as defined by ARB_compute_shader or a variable size as defined by ARB_compute_variable_group_size. v2: - update formatting spec quotations (Ian) - various cosmetic changes (Ian) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick

[Mesa-dev] [PATCH v2 07/14] glsl: add gl_LocalGroupSizeARB as a system value

2016-09-11 Thread Samuel Pitoiset
v2: - only add it if the ext is enabled (Ilia) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- src/compiler/glsl/builtin_variables.cpp | 6 ++ src/compiler/shader_enums.h | 1 + 2 files changed, 7 insertions(+) diff --git a/src/compiler/glsl/builtin_variables.cpp

[Mesa-dev] [PATCH v2 04/14] glsl: process local_size_variable input qualifier

2016-09-11 Thread Samuel Pitoiset
This is the new layout qualifier introduced by ARB_compute_variable_group_size which allows the use of a variable work group size. Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- src/compiler/glsl/ast.h | 5 + src/compiler/glsl/ast_type.cpp | 6

[Mesa-dev] [PATCH v2 01/14] glapi: add entry points for GL_ARB_compute_variable_group_size

2016-09-11 Thread Samuel Pitoiset
v2: - correctly sort that new extension (Ian) - fix up the comment (Ian) Signed-off-by: Samuel Pitoiset Reviewed-by: Ian Romanick --- .../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi

[Mesa-dev] [PATCH v2 00/14] add support for ARB_compute_variable_group_size

2016-09-11 Thread Samuel Pitoiset
eonsi will need a fix somewhere for handling a variable work group size, but as I don't have the hardware, I can't test. Let me know if something needs to be slightly updated. Please review, Thanks! Samuel Pitoiset (14): glapi: add entry points for GL_ARB_compute_variable_group_size

[Mesa-dev] [PATCH v2 05/14] glsl: reject compute shaders with fixed and variable local size

2016-09-11 Thread Samuel Pitoiset
The ARB_compute_variable_group_size specification explains that when a compute shader includes both a fixed and a variable local size, a compile-time error occurs. v2: - update formatting spec quotations (Ian) Signed-off-by: Samuel Pitoiset --- src/compiler/glsl/ast_to_hir.cpp | 14
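The rule quoted above (a compute shader may declare a fixed local size or a variable one, but not both) can be sketched roughly as follows. This is only an illustration: `struct cs_layout` and `cs_layout_is_valid` are hypothetical names and are not taken from the patch to ast_to_hir.cpp.

```c
#include <stdbool.h>

/* Hypothetical summary of the layout qualifiers seen in a compute shader. */
struct cs_layout {
    bool has_fixed_local_size;    /* e.g. layout(local_size_x = 8) in; */
    bool has_variable_local_size; /* layout(local_size_variable) in;   */
};

/* Returns false when both kinds of local size are declared, which the
 * extension describes as a compile-time error. */
static bool cs_layout_is_valid(const struct cs_layout *layout)
{
    return !(layout->has_fixed_local_size &&
             layout->has_variable_local_size);
}
```

A real compiler would report the error at the offending declaration; the sketch only captures the condition itself.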

[Mesa-dev] [PATCH v2 11/14] st/mesa: expose ARB_compute_variable_group_size

2016-09-11 Thread Samuel Pitoiset
This extension is only exposed if the underlying driver supports ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK is set. v2: - expose the ext based on that new cap Signed-off-by: Samuel Pitoiset --- src/mesa/state_tracker/st_extensions.c | 22 ++ 1

[Mesa-dev] [PATCH v2 02/14] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-11 Thread Samuel Pitoiset
v2: - update formatting spec quotations (Ian) - move the total_invocations check outside of the loop (Ian) Signed-off-by: Samuel Pitoiset --- src/mesa/main/api_validate.c | 96 src/mesa/main/api_validate.h | 4 ++ src/mesa/main/compute.c

[Mesa-dev] [PATCH v2 09/14] st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE

2016-09-11 Thread Samuel Pitoiset
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE which represents the block size in threads. Signed-off-by: Samuel Pitoiset Reviewed-by: Marek Olšák --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/state_tracker

[Mesa-dev] [PATCH v2 13/14] nvc0: expose ARB_compute_variable_group_size

2016-09-11 Thread Samuel Pitoiset
Let's return the same number of threads per block for both fixed and variable sizes. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c

[Mesa-dev] [PATCH v2 08/14] gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK

2016-09-11 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/screen.rst | 4 src/gallium/drivers/ilo/ilo_screen.c | 2 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++ src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 2 ++ src/gallium/drivers/radeon

[Mesa-dev] [PATCH v2 14/14] docs: mark ARB_compute_variable_group_size as done for nvc0

2016-09-11 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- docs/features.txt | 2 +- docs/relnotes/12.1.0.html | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/features.txt b/docs/features.txt index 690c160..3825943 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -279,7 +279,7

[Mesa-dev] [PATCH v2 10/14] st/mesa: add support for dispatching a variable local size

2016-09-11 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset Reviewed-by: Marek Olšák --- src/mesa/state_tracker/st_cb_compute.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_compute.c b/src/mesa/state_tracker/st_cb_compute.c index 88c1ee2..ccc5dc2 100644

[Mesa-dev] [PATCH v2 12/14] nv50/ir: use 1024 threads/block for variable local size

2016-09-11 Thread Samuel Pitoiset
When a variable local size is defined as specified by ARB_compute_variable_group_size, the fixed local size is set to 0 and a SIGFPE occurs when we compute the maximum number of regs. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++- 1 file changed
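The failure described above is an integer division by zero: with a variable local size, the fixed size is 0, so any "registers per thread" computation that divides by the thread count traps. A minimal sketch of the kind of guard the subject line suggests is shown below. The function name, its parameters and the register file size used in the note after it are assumptions for illustration, not the code in nv50_ir_target.h.

```c
#include <stdint.h>

/* Hypothetical helper: estimates how many registers each thread may use,
 * given the register file size and the fixed local size of the shader. */
static uint32_t max_regs_per_thread(uint32_t regfile_size,
                                    uint32_t size_x,
                                    uint32_t size_y,
                                    uint32_t size_z,
                                    uint32_t fallback_threads)
{
    uint32_t threads = size_x * size_y * size_z;

    /* A variable local size leaves the fixed size at 0; dividing by it
     * would raise SIGFPE, so assume a maximum block size instead. */
    if (threads == 0)
        threads = fallback_threads;

    /* Still nothing to divide by: report 0 rather than trapping. */
    if (threads == 0)
        return 0;

    return regfile_size / threads;
}
```

With an assumed register file of 32768 entries and the 1024 threads/block fallback named in the subject, `max_regs_per_thread(32768, 0, 0, 0, 1024)` yields 32, whereas an unguarded division would trap.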

Re: [Mesa-dev] [PATCH v2 08/14] gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK

2016-09-12 Thread Samuel Pitoiset
On 09/12/2016 05:35 PM, Nicolai Hähnle wrote: On 11.09.2016 20:45, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/screen.rst | 4 src/gallium/drivers/ilo/ilo_screen.c | 2 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2
