Re: [Mesa-dev] [PATCH v2 08/14] gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK

2016-09-12 Thread Samuel Pitoiset
On 09/12/2016 05:19 PM, Nicolai Hähnle wrote: On 11.09.2016 20:45, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/gallium/docs/source/screen.rst | 4 src/gallium/drivers/ilo/ilo_screen.c | 2 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2

Re: [Mesa-dev] [PATCH v2 11/14] st/mesa: expose ARB_compute_variable_group_size

2016-09-12 Thread Samuel Pitoiset
On 09/12/2016 05:26 PM, Nicolai Hähnle wrote: On 11.09.2016 20:45, Samuel Pitoiset wrote: This extension is only exposed if the underlying driver supports ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK is set. v2: - expose the ext based on that new cap Signed-off-by

[Mesa-dev] [PATCH shader-db] add a new option for selecting the render node ID

2016-09-12 Thread Samuel Pitoiset
When multiple GPUs are plugged in the same box, we might want to use /dev/dri/renderD129 without updating/compiling the code. This doesn't change the existing behaviour. --- run.c | 23 +++ 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/run.c b/run.c index c7f0b
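
The idea of choosing the render node at run time can be sketched as below. This is only an illustration, not the code from the patch: the helper name `render_node_path` is hypothetical, and it assumes the option carries the full node number (for example 129).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the DRM render node path for a given node
 * number into buf. Returns the length the full path needs (snprintf
 * semantics), so a NULL buffer with size 0 can be used to query it. */
int render_node_path(char *buf, size_t size, int node_id)
{
   return snprintf(buf, size, "/dev/dri/renderD%d", node_id);
}
```

A caller would pass the value parsed from the command line and open the resulting path instead of a fixed one.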

Re: [Mesa-dev] [PATCH] gm107/ir: fix texturing with indirect samplers

2016-10-18 Thread Samuel Pitoiset
On 10/18/2016 05:53 AM, Ilia Mirkin wrote: The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted by one argument position. Signed-off-by: Ilia Mirkin Cc: mesa-sta...@lists.freedesktop.org --- src

Re: [Mesa-dev] [PATCH] gm107/ir: fix bit offset of tex lod setting for indirect texturing

2016-10-18 Thread Samuel Pitoiset
->tex.rIndirectSrc >= 0) { emitInsn (0xdeb8); - emitField(0x35, 2, lodm); + emitField(0x25, 2, lodm); The length should be 3, but as we don't use lba/lla, it's fine. Reviewed-by: Samuel Pitoiset emitField(0x24, 1, insn->tex.useOffsets == 1); } els

Re: [Mesa-dev] [PATCH] gm107/ir: fix texturing with indirect samplers

2016-10-18 Thread Samuel Pitoiset
On 10/18/2016 10:50 AM, Samuel Pitoiset wrote: On 10/18/2016 05:53 AM, Ilia Mirkin wrote: The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted by one argument position. Signed-off-by: Ilia

Re: [Mesa-dev] [PATCH] gm107/ir: fix texturing with indirect samplers

2016-10-18 Thread Samuel Pitoiset
On 10/18/2016 12:33 PM, Samuel Pitoiset wrote: On 10/18/2016 10:50 AM, Samuel Pitoiset wrote: On 10/18/2016 05:53 AM, Ilia Mirkin wrote: The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted

[Mesa-dev] [PATCH] nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT

2016-10-18 Thread Samuel Pitoiset
Found that information message while replaying a trace from Metro 2033 Redux. Mark that property as useless for now. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/nouveau

Re: [Mesa-dev] [PATCH] mesa: remove unused LocalSizeVariable

2016-10-19 Thread Samuel Pitoiset
Ah, I probably forgot to remove it in the later revisions of my ARB_compute_variable_group_size series. Thanks. Reviewed-by: Samuel Pitoiset On 10/19/2016 01:51 AM, Timothy Arceri wrote: Cc: Samuel Pitoiset Cc: Kenneth Graunke --- src/mesa/main/mtypes.h| 5 - src/mesa/main

Re: [Mesa-dev] [PATCH] nv50, nvc0: avoid reading out of bounds when getting bogus so info

2016-10-19 Thread Samuel Pitoiset
radeonsi does the same check, so this seems correct. How did you catch this? Does this fix a CTS test or something else? Reviewed-by: Samuel Pitoiset On 10/19/2016 06:08 AM, Ilia Mirkin wrote: The state tracker tries to attach the info to the wrong shader. This is easy enough to protect against

Re: [Mesa-dev] [PATCH] nv50/ir: process texture offset sources as regular sources

2016-10-19 Thread Samuel Pitoiset
I'm aware of that CTS failure, and according to the GLSL shader this makes sense. I would like to run a quick piglit before you push it, but I'm confident in that change. :-) Reviewed-by: Samuel Pitoiset On 10/19/2016 07:22 AM, Ilia Mirkin wrote: With ARB_gpu_shader5, texture o

[Mesa-dev] [PATCH 1/3] nv50/ir: print CCTL subops in debug mode

2016-10-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp index dbd0f7d..0c143e5 100644 --- a

[Mesa-dev] [PATCH 2/3] nvc0/ir: remove useless NVC0LoweringPass::gMemBase

2016-10-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp

[Mesa-dev] [PATCH 3/3] nvc0/ir: simplify predicate logic for GK104 atomic operations

2016-10-19 Thread Samuel Pitoiset
The predicate is always CC_NOT_P as defined in processSurfaceCoordsNVE4(), so we only want to emit OR. Signed-off-by: Samuel Pitoiset --- .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp| 20 ++-- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/src/gallium

Re: [Mesa-dev] [PATCH 2/3] nvc0/ir: remove useless NVC0LoweringPass::gMemBase

2016-10-19 Thread Samuel Pitoiset
, Oct 19, 2016 at 5:21 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/ga

Re: [Mesa-dev] [PATCH 2/3] nvc0/ir: remove useless NVC0LoweringPass::gMemBase

2016-10-19 Thread Samuel Pitoiset
On 10/19/2016 11:41 PM, Ilia Mirkin wrote: On Wed, Oct 19, 2016 at 5:33 PM, Samuel Pitoiset wrote: On 10/19/2016 11:29 PM, Ilia Mirkin wrote: It avoids creating a ton of symbols unnecessarily during the lifetime of the pass. Does it hurt anything? I think either we use that symbol

[Mesa-dev] [PATCH] nvc0: do not break 3D state by pushing MS coordinates on Fermi

2016-10-19 Thread Samuel Pitoiset
be to use two different channels, one for 3D and one for CP. This fixes a bunch of regressions pinpointed by piglit. Fixes: "nvc0: fix up image support for allowing multiple samples" Cc: "12.0" Signed-off-by: Samuel Pitoiset --- This will require different piglit runs on bot

Re: [Mesa-dev] [PATCH] nvc0: do not break 3D state by pushing MS coordinates on Fermi

2016-10-19 Thread Samuel Pitoiset
On 10/20/2016 12:55 AM, Ilia Mirkin wrote: On Wed, Oct 19, 2016 at 6:46 PM, Samuel Pitoiset wrote: Long story short, 3D and CP are aliased on Fermi and initializing compute after pushing the MS sample coordinate offsets seems to corrupt 3D state for weird reasons. I still don't hav

Re: [Mesa-dev] [PATCH] nvc0: do not break 3D state by pushing MS coordinates on Fermi

2016-10-20 Thread Samuel Pitoiset
On 10/20/2016 12:46 AM, Samuel Pitoiset wrote: Long story short, 3D and CP are aliased on Fermi and initializing compute after pushing the MS sample coordinate offsets seems to corrupt 3D state for weird reasons. I still don't have the faintest clue what is going on, but this seems to

Re: [Mesa-dev] [PATCH v2] nv50, nvc0: don't keep track of whether fb rt0 is integer-only

2016-10-20 Thread Samuel Pitoiset
Reviewed-by: Samuel Pitoiset On 10/20/2016 04:44 AM, Ilia Mirkin wrote: This reverts commits 1af0641db345209c076e9b1ba4dca7524541671a and a6ad49cbbd599aec054d0a3163fff5ad724f2b18. st/mesa adjusts the rasterizer state for us now. Signed-off-by: Ilia Mirkin --- v1 -> v2: also revert the n

[Mesa-dev] [PATCH] nvc0: translate compute shaders at program creation

2016-10-20 Thread Samuel Pitoiset
This makes shader-db report results for compute shaders. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index

Re: [Mesa-dev] [PATCH] nvc0: translate compute shaders at program creation

2016-10-20 Thread Samuel Pitoiset
On Thu, Oct 20, 2016 at 12:08 PM, Samuel Pitoiset wrote: This makes shader-db report results for compute shaders. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 4 1 file changed, 4 insertions(+) diff --git a/src/gallium/drivers/nouveau/nvc0

Re: [Mesa-dev] [PATCH 1/5] nv50/ir: use levelZero for non-frag tex/txp ops

2016-10-21 Thread Samuel Pitoiset
Reviewed-by: Samuel Pitoiset On 10/21/2016 08:30 AM, Ilia Mirkin wrote: radeonsi also does the same thing. I suspect that this is likely to be a no-op in reality, but it brings nouveau code closer to what the blob produces. Plus it makes sense to not try to do auto-derivatives on this. Signed

Re: [Mesa-dev] [PATCH 2/5] nvc0/ir: use levelZero flag when the lod is set to 0

2016-10-21 Thread Samuel Pitoiset
Reviewed-by: Samuel Pitoiset On 10/21/2016 08:30 AM, Ilia Mirkin wrote: Signed-off-by: Ilia Mirkin --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b

Re: [Mesa-dev] [PATCH 3/5] nv50/ir: it appears that OP_DISCARD can't take a join modifier

2016-10-21 Thread Samuel Pitoiset
Reviewed-by: Samuel Pitoiset On 10/21/2016 08:30 AM, Ilia Mirkin wrote: nvdisasm does not print a .S even though the bit is set. Signed-off-by: Ilia Mirkin --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers

Re: [Mesa-dev] [PATCH 3/5] nv50/ir: it appears that OP_DISCARD can't take a join modifier

2016-10-21 Thread Samuel Pitoiset
On 10/21/2016 11:18 AM, Samuel Pitoiset wrote: Reviewed-by: Samuel Pitoiset On 10/21/2016 08:30 AM, Ilia Mirkin wrote: nvdisasm does not print a .S even though the bit is set. Signed-off-by: Ilia Mirkin --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 + 1 file changed, 1

Re: [Mesa-dev] [PATCH 2/5] nvc0/ir: use levelZero flag when the lod is set to 0

2016-10-21 Thread Samuel Pitoiset
This patch breaks a bunch of piglit tests, see a short list below: bin/arb_texture_barrier-blending-in-shader 512 42 1 128 7 -auto -fbo bin/arb_texture_buffer_object-formats vs core -auto -fbo bin/texelFetch 140 vs sampler2DRect -auto -fbo bin/mesa_pack_invert-readpixels -auto -fbo ... Around 15

[Mesa-dev] [PATCH] nvc0/ir: fix emission of SHLADD with NEG modifiers

2016-10-21 Thread Samuel Pitoiset
This affects GF100:GK110 chipsets, but not GM107+ where the logic is a bit different. The emitters tried to emit sub instead of subr when src0 has a NEG modifier. This fixes the following piglit tests glsl-fs-loop-nested and glsl-vs-loop-nested. Signed-off-by: Samuel Pitoiset Cc: "

[Mesa-dev] [PATCH] nvc0/ir: remove outdated comment about SHLADD

2016-10-21 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 1 - src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 1 - 2 files changed, 2 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers

[Mesa-dev] [PATCH] nv50/ir: do not perform global membar for shared memory

2016-10-24 Thread Samuel Pitoiset
Shared memory is local to CTA, thus we should only wait for prior memory writes which are visible to other threads in the same CTA, and not at global level. This should speed up compute shaders which use shared memory. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen

Re: [Mesa-dev] [PATCH] nv50/ir: do not perform global membar for shared memory

2016-10-24 Thread Samuel Pitoiset
On 10/24/2016 04:35 PM, Ilia Mirkin wrote: On Mon, Oct 24, 2016 at 10:29 AM, Samuel Pitoiset wrote: Shared memory is local to CTA, thus we should only wait for prior memory writes which are visible to other threads in the same CTA, and not at global level. This should speed up compute shaders

[Mesa-dev] [PATCH] nv50/ir: display OP_BAR subops in debug mode

2016-10-24 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp index 0c143e5..78c0757 100644 --- a

[Mesa-dev] [PATCH v2] nv50/ir: do not perform global membar for shared memory

2016-10-24 Thread Samuel Pitoiset
Shared memory is local to CTA, thus we should only wait for prior memory writes which are visible to other threads in the same CTA, and not at global level. This should speed up compute shaders which use shared memory. v2: - do not use == Signed-off-by: Samuel Pitoiset --- src/gallium/drivers

Re: [Mesa-dev] [PATCH] nv50/ir: start LocalCSE with getFirst to merge PHI instructions

2016-10-25 Thread Samuel Pitoiset
On 10/08/2016 06:58 PM, Samuel Pitoiset wrote: This breaks a bunch of things, like: spec/glsl-4.30/execution/built-in-functions/cs-all-bvec2-using-if: fail spec/glsl-4.30/execution/built-in-functions/cs-all-bvec3-using-if: fail spec/glsl-4.30/execution/built-in-functions/cs-all-bvec4-using-if

[Mesa-dev] [PATCH] nvc0: use correct bufctx when invalidating CP textures

2016-10-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset Cc: "12.0 13.0" --- src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c index cbc270d..e57391e 100644

[Mesa-dev] [PATCH 0/7] nvc0: various perf improvements for Elemental

2016-10-25 Thread Samuel Pitoiset
e and 3d instead of writing compiler optimizations. I didn't see any regressions with full piglit on GK110, but I will launch a new one on Fermi. Please review, Thanks! Samuel Pitoiset (7): nvc0: reduce the number of PUSH_SPACE in draw path nvc0: only update primitive restart for ind

[Mesa-dev] [PATCH 3/7] nvc0: simplify draw parameters upload for vertex shaders

2016-10-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c index 138e24d..11fd7eb 100644 --- a/src

[Mesa-dev] [PATCH 1/7] nvc0: reduce the number of PUSH_SPACE in draw path

2016-10-25 Thread Samuel Pitoiset
This might help CPU-bound applications but should not have any real effect on GPU-bound ones. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0

[Mesa-dev] [PATCH 6/7] nvc0: only invalidate currently bound tic/tsc

2016-10-25 Thread Samuel Pitoiset
This is especially useful when switching from compute to 3D. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 11 +++ src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 14 ++ src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 11

[Mesa-dev] [PATCH 2/7] nvc0: only update primitive restart for indexed draws

2016-10-25 Thread Samuel Pitoiset
Unnecessary to update it at every draw call, especially for non-indexed draws. This is similar to what nv50 already does. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers

[Mesa-dev] [PATCH 7/7] nvc0: add support for PIPE_CAP_SURFACE_REINTERPRET_BLOCKS

2016-10-25 Thread Samuel Pitoiset
Loosely based on radeonsi, thanks Nicolai! Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nv50/nv50_miptree.c | 13 src/gallium/drivers/nouveau/nv50/nv50_resource.h | 3 ++- src/gallium/drivers/nouveau/nvc0/nvc0_miptree.c | 6 -- src/gallium/drivers/nouveau

[Mesa-dev] [PATCH 5/7] nvc0: clean nve4_compute_validate_textures()

2016-10-25 Thread Samuel Pitoiset
It's not particularly useful to store commands which are going to be sent a few lines later. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 17 - 1 file changed, 4 insertions(+), 13 deletions(-) diff --git a/src/gallium/drivers/nouveau

[Mesa-dev] [PATCH 4/7] nvc0: be smarter when invalidating shader caches

2016-10-25 Thread Samuel Pitoiset
MEM_BARRIER seems to be similar to FLUSH, thus bit 0 is for flushing code while bit 12 is for constant buffers. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 2 +- 2 files changed, 2 insertions(+), 2
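
The bit layout described in this message can be written out as a small mask builder. This is a sketch under the message's own hedge ("seems to be"): the macro and function names are hypothetical and not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two bits the message describes. */
#define SKETCH_FLUSH_CODE (1u << 0)   /* bit 0: shader code */
#define SKETCH_FLUSH_CB   (1u << 12)  /* bit 12: constant buffers */

/* Build a barrier value that only touches the caches that need it. */
uint32_t sketch_barrier_mask(bool flush_code, bool flush_cb)
{
   uint32_t mask = 0;

   if (flush_code)
      mask |= SKETCH_FLUSH_CODE;
   if (flush_cb)
      mask |= SKETCH_FLUSH_CB;
   return mask;
}
```

With both flags set the value is 0x1001. A value of 0x1011 would additionally have bit 4 set, whose meaning is not given here.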

Re: [Mesa-dev] [PATCH 1/7] nvc0: reduce the number of PUSH_SPACE in draw path

2016-10-26 Thread Samuel Pitoiset
On 10/25/2016 09:49 PM, Ilia Mirkin wrote: What if instance_count = 1M? (It can happen.) We allocate a giant space in the pushbuf in one shot. Well, anyways this is not the optimization of the year, so I can drop it. :-) On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset wrote: This

Re: [Mesa-dev] [PATCH 4/7] nvc0: be smarter when invalidating shader caches

2016-10-26 Thread Samuel Pitoiset
% sure it's correct (and blob doesn't always use 0x1011). On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset wrote: MEM_BARRIER seems to be similar to FLUSH, thus bit 0 is for flushing code while bit 12 is for constant buffers. Signed-off-by: Samuel Pitoiset --- src/gallium/driv

Re: [Mesa-dev] [PATCH 6/7] nvc0: only invalidate currently bound tic/tsc

2016-10-26 Thread Samuel Pitoiset
On 10/25/2016 09:59 PM, Ilia Mirkin wrote: On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset wrote: This is especially useful when switching from compute to 3D. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 11 +++ src/gallium/drivers/nouveau

Re: [Mesa-dev] [PATCH 5/7] nvc0: clean nve4_compute_validate_textures()

2016-10-26 Thread Samuel Pitoiset
On 10/25/2016 09:57 PM, Ilia Mirkin wrote: It's useful because it lets you avoid having to send a bunch of begins. NAK. Well okay, we should probably add more consistency then, because it's the only place where we do something like that. On Tue, Oct 25, 2016 at 3:41 PM, Samue

Re: [Mesa-dev] [PATCH v2 1/6] gk110/ir: add LIMM form of mad

2016-10-26 Thread Samuel Pitoiset
You forgot to add emission for the CC flag, i.e.: if (i->flagsDef >= 0) code[1] |= 1 << 23; A few other comments below. On 10/09/2016 11:04 AM, Karol Herbst wrote: v2: renamed commit reordered modifiers add assert(dst == src2) Signed-off-by: Karol Herbst --- .../drivers/nouveau/codege
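
The CC flag emission asked for in this review can be isolated as a pure function. This is only a sketch: `sketch_apply_cc_flag` is a hypothetical name, and `word1` and `flags_def` stand in for `code[1]` and `i->flagsDef` from the quoted snippet.

```c
#include <stdint.h>

/* Set bit 23 of the second encoding word when the instruction defines a
 * flags register (flags_def >= 0); otherwise leave the word unchanged. */
uint32_t sketch_apply_cc_flag(uint32_t word1, int flags_def)
{
   if (flags_def >= 0)
      word1 |= 1u << 23;
   return word1;
}
```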

Re: [Mesa-dev] [PATCH v2 2/6] gm107/ir: add LIMM form of mad

2016-10-26 Thread Samuel Pitoiset
On 10/09/2016 11:04 AM, Karol Herbst wrote: v2: renamed commit reordered modifiers add assert(dst == src2) Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 35 -- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/sr

Re: [Mesa-dev] [PATCH v2 3/6] nv50/ir: replace post_ra_dead by Instruction::isDead

2016-10-26 Thread Samuel Pitoiset
I'm definitely in favour of my first solution, ie.: if (postRA) return post_ra_dead(this); On 10/09/2016 11:04 AM, Karol Herbst wrote: Signed-off-by: Karol Herbst --- src/gallium/drivers/nouveau/codegen/nv50_ir.h| 2 +- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 20

[Mesa-dev] [PATCH 3/3] nvc0: refactor textures/samplers validation

2016-10-26 Thread Samuel Pitoiset
e any improvements with Elemental but this might help in some cases. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 12 +- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 7 +- src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 159 ++--

[Mesa-dev] [PATCH 2/3] nvc0: add nvc0_screen_{tic,tsc}_lock()

2016-10-26 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 14 ++ src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 10 +- src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 4 ++-- 3 files changed, 21 insertions(+), 7 deletions(-) diff --git a/src

[Mesa-dev] [PATCH 1/3] nvc0: more use of nve4_p2mf_push_linear() in compute path

2016-10-26 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 72 ++--- 1 file changed, 16 insertions(+), 56 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nve4_compute.c b/src/gallium/drivers/nouveau/nvc0/nve4_compute.c index d661c00

[Mesa-dev] [PATCH] nvc0/ir: fix emission of IMAD with NEG modifiers

2016-10-26 Thread Samuel Pitoiset
The emitter tried to emit sub instead of subr when src0 has actually a NEG modifier. Signed-off-by: Samuel Pitoiset Cc: "11.0 12.0 13.0" --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 +- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 2 +- 2 files

[Mesa-dev] [PATCH] nv50, nvc0: stop limiting the number of active queries to 1

2016-10-26 Thread Samuel Pitoiset
nitor more queries than supported. This breaks amd_performance_monitor_measure but it's expected. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nv50/nv50_query.c | 14 ++ src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 14 ++ 2 files changed, 12

Re: [Mesa-dev] [PATCH v2 4/6] nv50/ir: restructure postraconstantfolding pass

2016-10-27 Thread Samuel Pitoiset
Reviewed-by: Samuel Pitoiset One minor comment below. On 10/09/2016 11:04 AM, Karol Herbst wrote: we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message Signed-off-by: Karol Herbst --- .../drivers/nouveau/codegen

[Mesa-dev] [PATCH] nvc0: do not duplicate similar performance metrics

2016-10-31 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 50 +++--- 1 file changed, 7 insertions(+), 43 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c

[Mesa-dev] [PATCH] nvc0: simplify the way percentage metrics are computed

2016-10-31 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c

Re: [Mesa-dev] [PATCH] nvc0: simplify the way percentage metrics are computed

2016-10-31 Thread Samuel Pitoiset
On 10/31/2016 02:21 PM, Ilia Mirkin wrote: Does that work? Won't you just end up with 0 all the time? Forgot to return double instead of uint64_t... On Mon, Oct 31, 2016 at 10:16 AM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau
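
One plausible reading of the "0 all the time" concern is integer truncation: a ratio below 1 computed in uint64_t collapses to 0, while the same ratio returned as double keeps its fractional part. The sketch below illustrates that general point only. The function names are hypothetical and the actual metric code is not shown in this excerpt.

```c
#include <stdint.h>

/* Integer division: any ratio in [0, 1) truncates to 0. */
uint64_t sketch_ratio_u64(uint64_t num, uint64_t den)
{
   return den ? num / den : 0;
}

/* Floating-point division keeps the fractional part. */
double sketch_ratio_double(uint64_t num, uint64_t den)
{
   return den ? (double)num / (double)den : 0.0;
}
```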

[Mesa-dev] [PATCH v2] nvc0: simplify the way percentage metrics are computed

2016-10-31 Thread Samuel Pitoiset
v2: - forgot to return double instead of uint64_t Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 26 +- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src

Re: [Mesa-dev] [PATCH v2] nvc0: simplify the way percentage metrics are computed

2016-10-31 Thread Samuel Pitoiset
On 10/31/2016 02:41 PM, Ilia Mirkin wrote: Is that worth it? Now you're getting potentially imprecise return results for int64 counters, just to remove a few *100? We don't have any int64 counters, we only use UINT64. On Mon, Oct 31, 2016 at 10:37 AM, Samuel Pitoiset
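
The precision concern raised here applies to any 64-bit integer passed through a double: an IEEE 754 double has a 53-bit significand, so values above 2^53 may not round-trip. A minimal sketch, with a hypothetical helper name and assuming IEEE 754 doubles:

```c
#include <stdint.h>

/* Convert a counter to double and back. Values up to 2^53 survive
 * exactly; larger values can be rounded to a representable neighbour. */
uint64_t sketch_round_trip(uint64_t value)
{
   volatile double as_double = (double)value; /* force a real double */
   return (uint64_t)as_double;
}
```

Whether this matters in practice depends on how large the counters get, which the excerpt does not say.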

Re: [Mesa-dev] [PATCH 2/2] nv50/ir: generalize sched block sizes and remove duplicated logic

2016-11-02 Thread Samuel Pitoiset
This is a nice refactoring. Reviewed-by: Samuel Pitoiset On 11/02/2016 05:38 AM, Ilia Mirkin wrote: The GM107 had a bunch of prepareEmission needlessly duplicated because the sched block size is different. Move that knowledge into the target, and generalize the existing code. Signed-off-by

Re: [Mesa-dev] [PATCH 1/2] gm107/ir: adjust jump offsets to avoid sched info

2016-11-02 Thread Samuel Pitoiset
This seems redundant, and because the GM107 emitter already has a bunch of emitXXX() helpers, how about adding emitTARG()? Like: void CodeEmitterGM107::emitTARG() { int32_t pos = insn->target.bb->binPos; if (writeIssueDelays && !(pos & 0x1f)) pos += 8; emitField(0x14, 24, pos - (c
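
The position adjustment in the suggested helper can be looked at on its own. The sketch below copies only the visible condition from the snippet into a free function with a hypothetical name. The truncated emitField() call is deliberately not reconstructed.

```c
#include <stdbool.h>
#include <stdint.h>

/* When issue delays are written and the target lands on a 32-byte
 * boundary (low five bits clear), skip 8 bytes so the jump does not
 * point at the scheduling info. */
int32_t sketch_adjust_target(int32_t pos, bool write_issue_delays)
{
   if (write_issue_delays && !(pos & 0x1f))
      pos += 8;
   return pos;
}
```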

[Mesa-dev] [PATCH 5/6] nvc0: add missing metric-issue_slot on SM35

2016-11-02 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c index 0d7ead3..03a3ff0 100644 --- a

[Mesa-dev] [PATCH 1/6] nvc0: sort performance metrics alphabetically

2016-11-02 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c index 36534ba

[Mesa-dev] [PATCH 0/6] nvc0: some more performance metrics changes

2016-11-02 Thread Samuel Pitoiset
! Samuel Pitoiset (6): nvc0: sort performance metrics alphabetically nvc0: respect 80-chars for perf metrics descriptions nvc0: add new warp_execution_efficiency metric on SM30+ nvc0: do not expose metric-inst_issued twice on SM35 nvc0: add missing metric-issue_slot on SM35 nvc0: add new

[Mesa-dev] [PATCH 6/6] nvc0: add new warp_nonpred_execution_efficiency metric on SM35

2016-11-02 Thread Samuel Pitoiset
Event not_predicated_off_thread_inst_executed is SM35+. Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 37 +- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.h| 1 + 2 files changed, 37 insertions(+), 1 deletion(-) diff --git a

[Mesa-dev] [PATCH 3/6] nvc0: add new warp_execution_efficiency metric on SM30+

2016-11-02 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 23 ++ .../drivers/nouveau/nvc0/nvc0_query_hw_metric.h| 1 + 2 files changed, 24 insertions(+) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium

[Mesa-dev] [PATCH 4/6] nvc0: do not expose metric-inst_issued twice on SM35

2016-11-02 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c index e5034f7..0d7ead3 100644 --- a/src

[Mesa-dev] [PATCH 2/6] nvc0: respect 80-chars for perf metrics descriptions

2016-11-02 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_metric.c index 6f02be3

[Mesa-dev] [PATCH] gm107/ir: emit RED instead of ATOM when no dst

2016-11-04 Thread Samuel Pitoiset
gk106 returned the correct value. Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 29 +- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers

Re: [Mesa-dev] [PATCH] gm107/ir: emit RED instead of ATOM when no dst

2016-11-04 Thread Samuel Pitoiset
On 11/04/2016 07:21 PM, Pierre Moreau wrote: Are reductions doable on shared atomics as well? AFAIK, no. Pierre On 08:08 pm - Nov 04 2016, Samuel Pitoiset wrote: This is similar to NVC0 and GK110 emitters where we emit reduction operations instead of atomic operations when the

[Mesa-dev] [PATCH] nvc0: get rid of NVE4_COMPUTE_MP_PM_{A, B}_SIGSEL_XXX

2016-11-05 Thread Samuel Pitoiset
Instead, hardcode group sigsel because there are a bunch of unknown groups, especially on SM50/SM52. Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 112 ++--- 1 file changed, 56 insertions(+), 56 deletions(-) diff --git a/src/gallium

Re: [Mesa-dev] [PATCH] nvc0: get rid of NVE4_COMPUTE_MP_PM_{A, B}_SIGSEL_XXX

2016-11-05 Thread Samuel Pitoiset
On 11/05/2016 06:06 PM, Ilia Mirkin wrote: On Sat, Nov 5, 2016 at 12:56 PM, Samuel Pitoiset wrote: Instead, hardcode group sigsel because there are a bunch of unknown groups, especially on SM50/SM52. Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c

Re: [Mesa-dev] [PATCH 3/3] nvc0: refactor textures/samplers validation

2016-11-07 Thread Samuel Pitoiset
s, it's more fine-grained texture flushes. I will run piglit on a few cards and check elemental on fermi/kepler to be sure the validation is still correct. On Wed, Oct 26, 2016 at 4:00 PM, Samuel Pitoiset wrote: The first goal is to reduce code duplication between 3d and compute and increase r

Re: [Mesa-dev] [PATCH 2/7] nvc0: only update primitive restart for indexed draws

2016-11-07 Thread Samuel Pitoiset
ce GL 4.5 On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset wrote: Unnecessary to update it at every draw call, especially for non-indexed draws. This is similar to what nv50 already does. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 5 +++-- 1 file change

Re: [Mesa-dev] [PATCH 1/7] nvc0: reduce the number of PUSH_SPACE in draw path

2016-11-07 Thread Samuel Pitoiset
On 11/07/2016 04:32 AM, Ilia Mirkin wrote: On Wed, Oct 26, 2016 at 4:14 AM, Samuel Pitoiset wrote: On 10/25/2016 09:49 PM, Ilia Mirkin wrote: What if instance_count = 1M? (It can happen.) We allocate a giant space in the pushbuf in one shot. Well, anyways this is not the optimization

Re: [Mesa-dev] [PATCH 3/7] nvc0: simplify draw parameters upload for vertex shaders

2016-11-07 Thread Samuel Pitoiset
On 11/07/2016 04:36 AM, Ilia Mirkin wrote: On Tue, Oct 25, 2016 at 3:41 PM, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/nouveau

[Mesa-dev] [PATCH v2] nvc0: only invalidate currently bound tic/tsc

2016-11-07 Thread Samuel Pitoiset
This is especially useful when switching from compute to 3D. v2: - get rid of one loop with 'x |= (1ULL << y) - 1' instead Signed-off-by: Samuel Pitoiset --- Tested with Elemental on GK208, works fine. src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 6 +++--- src/gallium
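
The v2 note replaces a loop with the expression 'x |= (1ULL << y) - 1'. The sketch below shows why the two are equivalent for setting the low y bits. The function names are hypothetical and the surrounding driver code is not reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Loop form: set bits 0..n-1 one at a time. */
uint64_t sketch_mark_low_bits_loop(uint64_t mask, unsigned n)
{
   for (unsigned i = 0; i < n; i++)
      mask |= 1ULL << i;
   return mask;
}

/* Loop-free form: (1ULL << n) - 1 has exactly the low n bits set. */
uint64_t sketch_mark_low_bits(uint64_t mask, unsigned n)
{
   assert(n < 64); /* shifting a 64-bit value by 64 or more is undefined */
   return mask | ((1ULL << n) - 1);
}
```

The loop-free form needs n to stay below 64, which is why the sketch asserts it.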

Re: [Mesa-dev] [PATCH v2] nvc0: only invalidate currently bound tic/tsc

2016-11-07 Thread Samuel Pitoiset
This could be still improved by adding textures/samplers_valid[6] into the context. On 11/07/2016 11:13 PM, Samuel Pitoiset wrote: This is especially useful when switching from compute to 3D. v2: - get rid of one loop with 'x |= (1ULL << y) - 1' instead Signed-off-by:

Re: [Mesa-dev] [PATCH 3/3] nvc0: refactor textures/samplers validation

2016-11-09 Thread Samuel Pitoiset
egressions with piglit on GF108 and GM107. Heaven, Valley and Shadow of Mordor look fine as well. On Wed, Oct 26, 2016 at 4:00 PM, Samuel Pitoiset wrote: The first goal is to reduce code duplication between 3d and compute and increase readability of that area. This refactoring also tries to redu

[Mesa-dev] [PATCH] nv50/ir: make sure to erase src2 after optimizing to MOV

2016-11-09 Thread Samuel Pitoiset
For all instructions with 3 sources (like OP_SLCT), src2 needs to be destroyed because srcExists(2) will return true although it's actually undefined. Spotted with my ADD3 series. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 1 + 1 file ch

Re: [Mesa-dev] [PATCH] nv50/ir: make sure to erase src2 after optimizing to MOV

2016-11-09 Thread Samuel Pitoiset
On 11/09/2016 03:58 PM, Ilia Mirkin wrote: On Wed, Nov 9, 2016 at 9:20 AM, Samuel Pitoiset wrote: For all instructions with 3 sources (like OP_SLCT), src2 needs to be destroyed because srcExists(2) will return true although it's actually undefined. Spotted with my ADD3 series. Sounds

Re: [Mesa-dev] [PATCH] nv50/ir: make sure to erase src2 after optimizing to MOV

2016-11-09 Thread Samuel Pitoiset
On 11/09/2016 04:19 PM, Ilia Mirkin wrote: On Wed, Nov 9, 2016 at 10:10 AM, Samuel Pitoiset wrote: On 11/09/2016 03:58 PM, Ilia Mirkin wrote: On Wed, Nov 9, 2016 at 9:20 AM, Samuel Pitoiset wrote: For all instructions with 3 sources (like OP_SLCT), src2 needs to be destroyed because

[Mesa-dev] [PATCH] nvc0: support MP performance counters on Maxwell

2016-11-10 Thread Samuel Pitoiset
hey work fine on the different cards. I have tested with nouveau-git this week, nothing changed. I will report the issue. Please review, Thanks! Samuel Pitoiset (1): nvc0: support MP performance counters on Maxwell .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 26 +- .../drivers/nouveau/nvc

[Mesa-dev] [PATCH] nvc0: support MP performance counters on Maxwell

2016-11-10 Thread Samuel Pitoiset
This adds some performance counters/metrics for SM50/SM52. Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_metric.c| 26 +- .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 740 - .../drivers/nouveau/nvc0/nvc0_query_hw_sm.h| 13

Re: [Mesa-dev] [PATCH v2] nvc0/ir: use levelZero flag when the lod is set to 0

2016-11-14 Thread Samuel Pitoiset
On 11/10/2016 03:42 AM, Ilia Mirkin wrote: Signed-off-by: Ilia Mirkin --- v1 -> v2: Move to handling this at SSA time. This is a lot more fragile since the texture arguments have been reordered already, but it's still easy enough to find the LOD argument. .../nouveau/codegen/nv50_ir_l

Re: [Mesa-dev] [PATCH v2] nvc0/ir: use levelZero flag when the lod is set to 0

2016-11-14 Thread Samuel Pitoiset
On 11/14/2016 06:53 PM, Ilia Mirkin wrote: On Mon, Nov 14, 2016 at 12:39 PM, Samuel Pitoiset wrote: On 11/10/2016 03:42 AM, Ilia Mirkin wrote: Signed-off-by: Ilia Mirkin --- v1 -> v2: Move to handling this at SSA time. This is a lot more fragile since the texture arguments h

[Mesa-dev] [PATCH 2/2] gm107/ir: optimize 32-bit CONST load to mov

2016-11-25 Thread Samuel Pitoiset
This is not allowed for indirect accesses because the source GPR might be erased by a subsequent instruction (WaR hazard) if we don't emit a read dep bar. Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_lowering_gm107.cpp | 16 .../drivers/no

[Mesa-dev] [PATCH 1/2] gm107/ir: do not combine CONST loads

2016-11-25 Thread Samuel Pitoiset
iform accesses. I should do something similar when loading from the driver constant buffer but it seems a bit tricky to handle for now. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff

[Mesa-dev] [PATCH] nv50/ir: use OPCLASS_SURFACE for SUSTB

2016-12-11 Thread Samuel Pitoiset
Found by inspection, probably a typo because a surface store is definitely not an atomic operation. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen

[Mesa-dev] [PATCH 3/5] nv50/ir: use sched control codes for gm107 builtins

2016-12-22 Thread Samuel Pitoiset
Yes, IMUL/IMAD require dependency barriers and we should definitely replace these instructions by XMAD but the different flags need to be figured out. Note that XMAD only supports 16-bit integers. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/lib/gm107.asm | 40

[Mesa-dev] [PATCH] nvc0: enable GL 4.3 on gm107+

2016-12-22 Thread Samuel Pitoiset
) and real games like Shadow of Mordor and they all work fine. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/no

[Mesa-dev] [PATCH 0/5] nvc0: better instruction pipelining for Maxwell GPUs

2016-12-22 Thread Samuel Pitoiset
h perf improvements with radeonsi because it already performs really well, unlike Nouveau. But with time and patience we can do better. :-) This series is also available from my fdo account: https://cgit.freedesktop.org/~hakzsam/mesa/log/?h=gm107_scheduler Please, review! Thanks. [1] https

[Mesa-dev] [PATCH 5/5] nvc0: use sched control codes for gm107 MP counters code

2016-12-22 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/nvc0/nvc0_query_hw_sm.c| 88 +++--- 1 file changed, 44 insertions(+), 44 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw_sm.c index

[Mesa-dev] [PATCH 4/5] nvc0: use sched control codes for gm107 blitter shader

2016-12-22 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset Acked-by: Ilia Mirkin --- src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c

[Mesa-dev] [PATCH 2/5] nv50/ir: improve instruction pipelining on gm107

2016-12-22 Thread Samuel Pitoiset
corner case somewhere. That way, the scheduler is enabled by default but it can be deactivated by using NV50_PROG_SCHED=0. Thanks to Scott Gray for the reverse engineering work available from https://github.com/NervanaSystems/maxas/wiki/Control-Codes. Signed-off-by: Samuel Pitoiset

[Mesa-dev] [PATCH 1/5] nv50/ir: do not insert texture barriers on gm107

2016-12-22 Thread Samuel Pitoiset
It's actually useless to insert those texture barriers post RA because the current control code (i.e. st 0x0) will wait for all dependencies before issuing a new instruction. Signed-off-by: Samuel Pitoiset Reviewed-by: Ilia Mirkin --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering

Re: [Mesa-dev] [PATCH v2 1/3] nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings

2016-06-29 Thread Samuel Pitoiset
I saw those warnings a few weeks ago when I updated to GCC6 as well (but was too lazy to fix them). Thanks. Reviewed-by: Samuel Pitoiset On 06/29/2016 02:38 PM, Hans de Goede wrote: Signed-off-by: Hans de Goede --- src/gallium/drivers/nouveau/codegen/nv50_ir_util.h | 4 1 file changed, 4

Re: [Mesa-dev] [PATCH v2 2/3] nouveau: Fix a couple of "foo may be used uninitialized' compiler warnings

2016-06-29 Thread Samuel Pitoiset
On 06/29/2016 03:33 PM, Ilia Mirkin wrote: Since you're the 75th person to send these, I'm going to break down and say "fine, whtvr". I really hate this "pander to stupid compilers" things. If the warning is wrong, my natural inclination would be to disable it (after my even more natural inclin
