specifies noperspective
interpolation qualifier, but fragment shader input specifies no
interpolation qualifier
Signed-off-by: Samuel Pitoiset
---
src/mesa/drivers/dri/common/drirc | 4
1 file changed, 4 insertions(+)
diff --git a/src/mesa/drivers/dri/common/drirc
b/src/mesa/drivers/
n RadeonSI.
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast.h | 16
src/compiler/glsl/ast_to_hir.cpp | 21 ++---
src/compiler/glsl/ast_type.cpp | 12 +++-
src/compiler/glsl/glsl_parser.yy | 12
4 files changed, 29 inserti
Preliminary work for ARB_bindless_texture which can interact
with ARB_shader_image_load_store.
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/builtin_variables.cpp | 3 +--
src/compiler/glsl/glsl_parser.yy | 3 +--
src/compiler/glsl/glsl_parser_extras.h | 5 +
3 files changed
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast_type.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index 580d216b30..96d20c10af 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl
On 02/23/2017 07:48 PM, Marek Olšák wrote:
From: Marek Olšák
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850
Cc: 13.0 17.0
---
src/gallium/drivers/radeonsi/si_state_shaders.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeonsi
Hi there,
I started to work on ARB_bindless_texture which is an important missing
feature in Mesa. Some games, at least Deus Ex: Mankind Divided, would
benefit from this extension. As the spec says:
"The ability to access textures without having to bind and/or
re-bind them is similar to the cap
On 02/24/2017 10:07 AM, Andres Gomez wrote:
On Thu, 2017-02-23 at 18:07 +0100, Samuel Pitoiset wrote:
The main idea behind this is to free some bits in the flags.q
struct because currently all 64-bits are used and we can't
add more layout qualifiers without reaching a static assert.
In
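The 64-bit limit mentioned above can be sketched as follows. This is a minimal illustration, assuming a union of one-bit flags overlaid on a 64-bit integer; the type name qualifier_flags and the three flag names are hypothetical and do not reproduce Mesa's real ast_type_qualifier.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical qualifier flags: the names are illustrative only and are
// not the real fields of Mesa's ast_type_qualifier.
union qualifier_flags {
   struct {
      unsigned invariant:1;
      unsigned precise:1;
      unsigned constant:1;
      // ... further one-bit flags would be listed here ...
   } q;
   uint64_t i;   // all flag bits viewed as one 64-bit integer
};

// Once the one-bit flags need more than 64 bits, the struct grows past
// the integer, the union gets larger, and this assert stops the build.
static_assert(sizeof(qualifier_flags) == sizeof(uint64_t),
              "qualifier flags must fit in 64 bits");
```

With such a layout every new layout qualifier costs one bit, which is why freeing bits matters before more qualifiers can be added.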
ed as function
parameters or in uniform-qualified variables."
Image variables and atomic counters are already rejected in this
situation.
Note that opaque variables can't be treated as l-values, which
means only the 'in' function parameter is allowed.
Signed-off-by: Samuel Pi
On 02/24/2017 12:28 PM, Timothy Arceri wrote:
On 24/02/17 21:50, Samuel Pitoiset wrote:
From section 4.1.7 of the GLSL 4.40 spec:
"The opaque types declare variables that are effectively opaque
handles to other objects. These objects are accessed through
built-in functions
This is similar to what we do in the texture error codepath.
While we are at it, update the specification comment with
latest GL 4.5 spec.
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/samplerobj.c | 139 +
1 file changed, 52 insertions(+), 87
On 02/24/2017 12:29 PM, Samuel Pitoiset wrote:
On 02/24/2017 12:28 PM, Timothy Arceri wrote:
On 24/02/17 21:50, Samuel Pitoiset wrote:
From section 4.1.7 of the GLSL 4.40 spec:
"The opaque types declare variables that are effectively opaque
handles to other objects. These ob
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast_to_hir.cpp | 22 +++---
src/compiler/glsl/ir.cpp | 4 ++--
src/compiler/glsl/link_uniform_initializers.cpp | 3 +--
src/compiler/glsl_types.cpp | 2 +-
src/mesa
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/link_uniform_initializers.cpp | 2 +-
src/compiler/glsl_types.cpp | 3 +--
src/mesa/main/uniform_query.cpp | 2 +-
3 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/src/compiler/glsl
This improves consistency with image variables and atomic
counters which are already rejected the same way.
Note that opaque variables can't be treated as l-values, which
means only the 'in' function parameter is allowed.
v2: rewrite commit message
Signed-off-by: Samuel Pitois
utine".
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast.h | 1 -
src/compiler/glsl/ast_to_hir.cpp | 6 +++---
src/compiler/glsl/ast_type.cpp | 6 ++
src/compiler/glsl/glsl_parser.yy | 1 -
src/compiler/glsl/glsl_parser_extras.cpp | 2 +
On 02/26/2017 10:19 PM, Timothy Arceri wrote:
On 25/02/17 22:15, Samuel Pitoiset wrote:
This bit is definitely not necessary because subroutine_list
can be used instead. This frees one more bit in the flags.q
struct which is nice because arb_bindless_texture will need
4 bits for the new
On 07/26/2015 06:56 AM, Ilia Mirkin wrote:
Apparently this is necessary in order for tess factors to work in a tess
eval program without a tess control program bound. Probably because it
uses the fake program's shader header to work out the number of patch
constants.
Fixes vs-tes-tessinner-tes
On 07/30/2015 04:26 PM, Marek Olšák wrote:
From: Marek Olšák
---
src/mesa/state_tracker/st_cb_xformfb.c | 58 ++
src/mesa/state_tracker/st_cb_xformfb.h | 2 +-
src/mesa/state_tracker/st_draw.c | 2 +-
3 files changed, 33 insertions(+), 29 deletions
Patches 1-6 are:
Reviewed-by: Samuel Pitoiset
But please, fix the commit message for patches 1 and 3 (ie. gallium/hud
instead of gallium, hud).
Btw, it would be good to display floating point numbers when percentage
is used.
What do you think?
On 08/03/2015 02:42 PM, Marek Olšák wrote
On 08/03/2015 05:28 PM, Marek Olšák wrote:
On Mon, Aug 3, 2015 at 2:58 PM, Samuel Pitoiset
wrote:
Patches 1-6 are:
Reviewed-by: Samuel Pitoiset
But please, fix the commit message for patches 1 and 3 (ie. gallium/hud
instead of gallium, hud).
Btw, it would be good to display floating
On 08/03/2015 08:14 PM, Marek Olšák wrote:
On Mon, Aug 3, 2015 at 2:58 PM, Samuel Pitoiset
wrote:
Patches 1-6 are:
Reviewed-by: Samuel Pitoiset
But please, fix the commit message for patches 1 and 3 (ie. gallium/hud
instead of gallium, hud).
"gallium,hud" means "gallium
Reviewed-by: Samuel Pitoiset
This fix is simpler than I expected. What about the edge flag stuff
now? :)
On 08/24/2015 05:51 PM, Ilia Mirkin wrote:
The hardware only generates vertexid when vertices come from a VBO. This
fixes:
vertexid-drawelements
vertexid-drawarrays
Signed
o often.
Good, I'd be happy to have a look at this second approach.
On Mon, Aug 24, 2015 at 4:07 PM, Samuel Pitoiset
wrote:
Reviewed-by: Samuel Pitoiset
This fix is simpler than I expected. What about the edge flag stuff now?
:)
On 08/24/2015 05:51 PM, Ilia Mirkin wrote:
The hard
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_query.c | 56 +-
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 56 +-
2 files changed, 56 insertions(+), 56 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 20
1 file changed, 20 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
index d8826ae..41008d2 100644
--- a/src
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 243 +
1 file changed, 123 insertions(+), 120 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
index 41008d2
.
This option is fine as well of course.
Reviewed-by: Nicolai Hähnle
Thanks!
On 12.09.2016 23:27, Samuel Pitoiset wrote:
When multiple GPUs are plugged in the same box, we might want to
use /dev/dri/renderD129 without updating/compiling the code. This
doesn't change the existing beha
The comment for the commutative flags was wrong because OP_MUL is
before OP_MAD. While we are at it add missing opcodes, and fix
the comment about the short forms.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 5 +++--
1 file changed, 3
(0.01%)
total local used in shared programs :31872 -> 31872 (0.00%)
        local   gpr   inst   bytes
helped      0     0     39      39
hurt        0    26      0       0
Signed-off-by: Samuel Pitoiset
---
.../driver
This is similar to what we already do for MAD/FMA.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium
This instruction is new in SM50 (Maxwell) and performs
an add with three sources. Unfortunately, it only supports integers.
v3: - set commutative flag for OP_ADD3
- move OP_ADD3 after arithmetic ops
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen
And ADD3(d, a, 0x0, c) to ADD(d, a, c) as well.
v2: - use moveSources()
- allow ADD3 -> ADD when srcFlags is set
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/gall
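The ADD3-to-ADD folding described in this patch can be sketched on a toy instruction model. The Src and Insn types and the fold_add3 function below are hypothetical stand-ins, not the nv50_ir classes, and the sketch ignores source modifiers (the srcFlags case mentioned in the v2 notes).

```cpp
#include <cassert>
#include <cstdint>

enum class Op { ADD, ADD3 };

// Toy source operand: either an immediate or a register id.
struct Src {
   bool is_imm;     // true when 'value' is an immediate constant
   int32_t value;   // immediate value, or register id otherwise
};

// Toy instruction with up to three sources.
struct Insn {
   Op op;
   Src src[3];
   int nsrc;
};

// Rewrite ADD3 as ADD when one source is the constant 0.
// Returns true when the instruction was changed.
bool fold_add3(Insn &i)
{
   if (i.op != Op::ADD3)
      return false;
   assert(i.nsrc == 3);

   for (int s = 0; s < 3; ++s) {
      if (!(i.src[s].is_imm && i.src[s].value == 0))
         continue;
      // Move the remaining sources down over the zero operand.
      for (int k = s; k < 2; ++k)
         i.src[k] = i.src[k + 1];
      i.src[2] = Src{};
      i.op = Op::ADD;
      i.nsrc = 2;
      return true;
   }
   return false;
}
```

Because the loop checks every source slot, the same helper covers ADD3(d, a, 0x0, c) and the other zero positions.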
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8
1 file changed, 8 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index ecde364..246cdff
With OP_ADD3, we might want to swap sources 2 and 1.
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 ++
1 file changed, 29 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 11 +++
1 file changed, 11 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index fe815e3..ecde364
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 55 ++
1 file changed, 55 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index f212eba
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 34 ++
1 file changed, 34 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index cfde66c
This allows using hexadecimal numbers, which are automatically
detected by strtol() when the base is 0.
Signed-off-by: Samuel Pitoiset
---
src/gallium/auxiliary/util/u_debug.c | 25 -
1 file changed, 8 insertions(+), 17 deletions(-)
diff --git a/src/gallium/auxiliary
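The base-0 behaviour of strtol() that this patch relies on can be shown with a small helper. This is a sketch only: parse_num_option is a hypothetical name and is not the function changed in u_debug.c.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Parse an integer option string and let strtol() pick the base:
// with base 0, a "0x" prefix selects hexadecimal, a leading "0"
// selects octal, and anything else is read as decimal.
// Returns 'fallback' when the string is missing or malformed.
long parse_num_option(const char *str, long fallback)
{
   if (!str)
      return fallback;

   char *end = nullptr;
   errno = 0;
   long value = std::strtol(str, &end, 0);

   // Reject empty input, trailing garbage, and out-of-range values.
   if (end == str || *end != '\0' || errno == ERANGE)
      return fallback;
   return value;
}
```

Note that a leading zero switches to octal, so "010" parses as 8 rather than 10.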
This adds a new envvar called NOUVEAU_FORCE_CHIPSET which allows
compiling shaders for a different target, which is especially useful
for shader-db.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 19 ++-
1 file changed, 10 insertions(+), 9
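An override of this kind might look like the sketch below. Only the variable name NOUVEAU_FORCE_CHIPSET comes from the patch description; the helper pick_chipset and the choice to parse the value with base 0 (so that hexadecimal chipset ids are accepted) are assumptions, not the code in nvc0_program.c.

```cpp
#include <cassert>
#include <cstdlib>

// Return the chipset id to compile for: the value forced through the
// environment when it is set and parses cleanly, otherwise the id of
// the real device. pick_chipset is a hypothetical helper.
unsigned pick_chipset(unsigned real_chipset)
{
   const char *forced = std::getenv("NOUVEAU_FORCE_CHIPSET");
   if (!forced || !*forced)
      return real_chipset;

   char *end = nullptr;
   unsigned long value = std::strtoul(forced, &end, 0);
   if (end == forced || *end != '\0')
      return real_chipset;   // malformed value: ignore the override
   return static_cast<unsigned>(value);
}
```

Falling back to the real chipset on a malformed value keeps normal runs unaffected when the variable is unset or mistyped.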
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index 58a5d38..e85b5fa 100644
--- a/src
The offset was wrong: it's at bit 8, not 4. Also, use subr instead
of sub when src2 has neg. Similar to GK110 now.
Signed-off-by: Samuel Pitoiset
Cc: mesa-sta...@lists.freedesktop.org
---
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 10 ++
1 file changed, 6 insertions(
On 09/15/2016 06:08 PM, Ilia Mirkin wrote:
On Thu, Sep 15, 2016 at 12:07 PM, Samuel Pitoiset
wrote:
The offset was wrong: it's at bit 8, not 4. Also, use subr instead
of sub when src2 has neg. Similar to GK110 now.
Signed-off-by: Samuel Pitoiset
Cc: mesa-sta...@lists.freedesktop.org
---
The offset was wrong: it's at bit 8, not 4. Also, use subr instead
of sub when src2 has neg. Similar to GK110 now.
Signed-off-by: Samuel Pitoiset
Cc: mesa-sta...@lists.freedesktop.org
---
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 10 ++
1 file changed, 6 insertions(
396684 (-0.01%)
total local used in shared programs :34432 -> 34416 (-0.05%)
        local   gpr   inst   bytes
helped      1    19    112     112
hurt        0     0      0       0
Signed-off-by: Samuel Pitoiset
---
sr
This should emit src0 instead of src1.
Found by inspection.
Signed-off-by: Samuel Pitoiset
Cc: mesa-sta...@lists.freedesktop.org
---
src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/nouveau/codegen
Same thing as nvc0_stage_set_sampler_views_range().
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 83 +++
1 file changed, 9 insertions(+), 74 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
b/src/gallium
This function was quite similar to nvc0_stage_set_sampler_views()
and I don't see any reason not to remove it.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 104 --
1 file changed, 15 insertions(+), 89 deletions(-)
diff --git
This instruction has been available since SM20 (Fermi) and allows doing
(a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when b is a power of 2, and ADD+SHL should be
replaced by SHLADD as well.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/c
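The equivalence behind this patch, (a << b) + c matching a multiply-add when the multiplier is 2^b, can be checked with reference functions. The names shladd and imad below are illustrative models of the operations on 32-bit unsigned values, not the nv50_ir opcodes themselves; requiring b < 32 is an assumption made to keep the C++ shift well defined.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of SHLADD: (a << b) + c with 32-bit wrap-around.
uint32_t shladd(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b < 32);   // larger shifts are undefined for uint32_t in C++
   return (a << b) + c;
}

// Reference model of IMAD: a * m + c with 32-bit wrap-around.
uint32_t imad(uint32_t a, uint32_t m, uint32_t c)
{
   return a * m + c;
}
```

For example, imad(5, 8, 3) and shladd(5, 3, 3) both compute 5 * 8 + 3, and the two agree on overflow because both wrap modulo 2^32.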
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8
1 file changed, 8 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 336f407..1b99ce7
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 1b99ce7..75c448e 100644
We can replace IMAD by SHLADD if and only if src1 is a power of 2.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 13 +
1 file changed, 13 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src
Unfortunately, we can't use the emit helpers for GF100/GK110
because src1 and src2 are swapped.
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 53 ++
.../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 +
.../dr
peculiar with some of these library functions.
Cheers,
Nicolai
On 14.09.2016 20:37, Samuel Pitoiset wrote:
This allows using hexadecimal numbers, which are automatically
detected by strtol() when the base is 0.
Signed-off-by: Samuel Pitoiset
---
src/gallium/auxiliary/util/u_debug.c | 25
inst bytes
helped 0 326 1105 1105
hurt 0 55 3 3
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/galliu
On 09/20/2016 12:16 AM, Ilia Mirkin wrote:
On Mon, Sep 19, 2016 at 6:11 PM, Samuel Pitoiset
wrote:
This instruction has been available since SM20 (Fermi) and allows doing
(a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when b is a power of 2, and ADD+SHL sho
v2: - update formatting spec quotations (Ian)
- move the total_invocations check outside of the loop (Ian)
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/api_validate.c | 96
src/mesa/main/api_validate.h | 4 ++
src/mesa/main/compute.c
This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/ast.h | 5 +
src/compiler/glsl/ast_type.cpp | 6
ing needs to be slightly updated.
Please review,
Thanks!
Samuel Pitoiset (14):
glapi: add entry points for GL_ARB_compute_variable_group_size
mesa/main: add support for ARB_compute_variable_group_size
glsl: add enable flags for ARB_compute_variable_group_size
glsl: process local_size_variable
v2: - correctly sort that new extension (Ian)
- fix up the comment (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++
src/mapi/glapi/gen/Makefile.am | 1 +
src/mapi/glapi
Compute shaders can now include a fixed local size as defined by
ARB_compute_shader or a variable size as defined by
ARB_compute_variable_group_size.
v2: - update formatting spec quotations (Ian)
- various cosmetic changes (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
The ARB_compute_variable_group_size specification explains that
when a compute shader includes both a fixed and a variable local
size, a compile-time error occurs.
v2: - update formatting spec quotations (Ian)
Signed-off-by: Samuel Pitoiset
---
src/compiler/glsl/ast_to_hir.cpp | 14
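The rule stated in this patch, that declaring both a fixed and a variable local size is a compile-time error, can be sketched as a standalone check. The cs_layout struct, the function validate_local_size, and the error text are hypothetical; they are not taken from ast_to_hir.cpp or quoted from the specification.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified view of a compute shader's layout state.
struct cs_layout {
   bool has_fixed_local_size;      // a fixed local size was declared
   bool has_variable_local_size;   // a variable local size was declared
};

// Returns an empty string when the combination is valid, otherwise an
// illustrative error message.
std::string validate_local_size(const cs_layout &layout)
{
   if (layout.has_fixed_local_size && layout.has_variable_local_size)
      return "compute shader can't include both a variable and a "
             "fixed local group size";
   return "";
}
```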
v2: - only add it if the ext is enabled (Ilia)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/builtin_variables.cpp | 6 ++
src/compiler/shader_enums.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/src/compiler/glsl/builtin_variables.cpp
Let's return the same number of threads per block for both fixed and
variable sizes.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
---
src/mesa/state_tracker/st_cb_compute.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/mesa/state_tracker/st_cb_compute.c
b/src/mesa/state_tracker/st_cb_compute.c
index 88c1ee2..ccc5dc2 100644
This also initializes the default values for the standalone compiler.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/glsl_parser_extras.cpp | 1 +
src/compiler/glsl/glsl_parser_extras.h | 2 ++
src/compiler/glsl/standalone.cpp | 4
src
This extension is only exposed if the underlying driver supports
ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK
is set.
v3: - initialize max_variable_threads_per_block to 0
v2: - expose the ext based on that new cap
Signed-off-by: Samuel Pitoiset
---
src/mesa
Signed-off-by: Samuel Pitoiset
---
docs/features.txt | 2 +-
docs/relnotes/12.1.0.html | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/docs/features.txt b/docs/features.txt
index fbb3952..6cc429a 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -279,7 +279,7
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE
which represents the block size in threads.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/state_tracker
v3: - use a new case statement in r600_pipe_common.c
- fix compilation of softpipe...
Signed-off-by: Samuel Pitoiset
---
src/gallium/docs/source/screen.rst | 4
src/gallium/drivers/ilo/ilo_screen.c | 2 ++
src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 ++
src
When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0
and a SIGFPE occurs when we compute the maximum number of regs.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++-
1 file changed
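The SIGFPE described here comes from dividing by a thread count of 0. A guarded version of that kind of computation might look like the sketch below; max_gprs_per_thread and its parameters are hypothetical, and any register-file size passed in is the caller's assumption rather than a value taken from nv50_ir_target.h.

```cpp
#include <cassert>
#include <cstdint>

// regfile_size:         registers available to one thread block
// threads:              fixed threads per block, or 0 when the shader
//                       declares a variable local size
// max_variable_threads: thread count to assume in the variable case
uint32_t max_gprs_per_thread(uint32_t regfile_size, uint32_t threads,
                             uint32_t max_variable_threads)
{
   if (threads == 0)
      threads = max_variable_threads;   // variable local size
   if (threads == 0)
      return 0;                         // nothing sensible to divide by
   return regfile_size / threads;
}
```

Assuming a 32768-entry register file, a variable-size shader limited to 512 threads gets 32768 / 512 = 64 registers per thread, while 1024 threads would leave only 32.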
an be done just after the mesa/gallium
bits are upstream. :)
On Mon, Sep 26, 2016 at 1:23 PM, Samuel Pitoiset
wrote:
Let's return the same number of threads per block for both fixed and
variable sizes.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
inst bytes
helped 0 326 1105 1105
hurt 0 55 3 3
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/galliu
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8
1 file changed, 8 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index c9d5b5f..cbbe34d
This instruction has been available since SM20 (Fermi) and allows doing
(a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when b is a power of 2, and ADD+SHL should be
replaced by SHLADD as well.
v2: - fix up the commutative table on nv50/ir
Signed-off-by: Samuel Pi
We can replace IMAD by SHLADD if and only if src1 is a power of 2.
v2: - use non-negative values and use applyLog2()
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 7 +++
1 file changed, 7 insertions(+)
diff --git a/src/gallium/drivers
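The power-of-2 condition and the log2 step mentioned in the v2 notes can be sketched with plain integer helpers. The functions below are hypothetical and work on unsigned values only; they do not reproduce applyLog2() or the peephole pass itself.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit of v is set, i.e. v is a power of two.
bool is_pow2(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

// Base-2 logarithm of a power of two.
unsigned log2_pow2(uint32_t v)
{
   assert(is_pow2(v));
   unsigned n = 0;
   while (v >>= 1)
      ++n;
   return n;
}

// Decide whether a * mul + c can become (a << shift) + c.
// On success, writes the shift amount and returns true.
bool imad_mul_to_shift(uint32_t mul, unsigned *shift)
{
   if (!is_pow2(mul))
      return false;
   *shift = log2_pow2(mul);
   return true;
}
```

A multiplier of 8 yields a shift of 3, while 6 is rejected and the IMAD would be left alone.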
Unfortunately, we can't use the emit helpers for GF100/GK110
because src1 and src2 are swapped.
v2: - s/emitSHLADD/emitISCADD for GM107 emitter
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 53 ++
.../drivers/nouveau/co
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index cbbe34d..9875738 100644
teach isModSupported() about SHLADD
v2: - fix up the commutative table on nv50/ir
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 +
src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp | 1 +
src/gallium/drivers/nouveau/c
Unfortunately, we can't use the emit helpers for GF100/GK110
because src1 and src2 are swapped.
v3: - remove useless use of src1 neg mod
v2: - s/emitSHLADD/emitISCADD for GM107 emitter
Signed-off-by: Samuel Pitoiset
---
.../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
envyas now uses a much better representation for those control
codes and it displays the different flags instead of an
unreadable hex number.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/codegen/lib/gm107.asm | 42 +++
src/gallium/drivers/nouveau/nvc0
shaderdb runner fails at parsing shader_test files when the first
line inside the require block is not 'GLSL >= x.y'. This just skips
the GL version requirement, which is actually unused, and allows
compiling 164 more shaders from piglit.
---
run.c | 6 ++
1 file changed, 6 insertions(+)
diff --gi
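The parsing change described in this patch can be sketched as a scan over the lines of a require block that ignores the GL version line. The function find_glsl_version is a hypothetical C++ illustration of the idea, not the code added to run.c.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Scan the lines of a [require] block and return the version that
// follows "GLSL >= ". Lines starting with "GL >= " (and any other
// requirement) are skipped. Returns "" when no GLSL line is found.
std::string find_glsl_version(const std::string &require_block)
{
   const std::string glsl = "GLSL >= ";
   std::istringstream in(require_block);
   std::string line;

   while (std::getline(in, line)) {
      if (line.compare(0, glsl.size(), glsl) == 0)
         return line.substr(glsl.size());
      // Anything else, including "GL >= x.y", is not needed here.
   }
   return "";
}
```

With this approach the GLSL line no longer has to be the first line of the block.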
Currently, program binaries are only dumped at upload time, but
when the chipset has been forced via NV50_PROG_CHIPSET we might
want to show the generated code, especially with shaderdb.
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 5 +
1 file changed
On 10/03/2016 06:55 PM, Karol Herbst wrote:
fixes a crash in the case simplify reports an error
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 12 +++-
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/nouveau/code
On 10/04/2016 10:11 AM, Nicolai Hähnle wrote:
From: Nicolai Hähnle
The difference to the virtually identical ARB_robustness (which is already
enabled unconditionally) is minuscule and handled elsewhere, but this cap
seems like the right thing to require for this extension.
I guess you migh
On 10/04/2016 11:18 AM, Nicolai Hähnle wrote:
On 02.10.2016 16:27, Samuel Pitoiset wrote:
shaderdb runner fails at parsing shader_test files when the first
line inside the require block is not 'GLSL >= x.y'. This just skips
the GL version requirement which is actually unused
On 09/27/2016 08:58 PM, Nicolai Hähnle wrote:
On 26.09.2016 19:23, Samuel Pitoiset wrote:
v2: - update formatting spec quotations (Ian)
- move the total_invocations check outside of the loop (Ian)
Signed-off-by: Samuel Pitoiset
---
src/mesa/main/api_validate.c | 96
On 09/27/2016 09:12 PM, Nicolai Hähnle wrote:
On 26.09.2016 19:23, Samuel Pitoiset wrote:
This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src
On 09/27/2016 09:15 PM, Nicolai Hähnle wrote:
On 26.09.2016 19:23, Samuel Pitoiset wrote:
v3: - use a new case statement in r600_pipe_common.c
- fix compilation with softpipe
- initialize max_variable_threads_per_block to 0
I have sent some remarks on patches 2 and 4. Patches 1, 3
This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.
v4: - add missing '%s' in the monster format string
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
---
src/compiler/glsl/ast.h
This also initializes the default values for the standalone compiler.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
Reviewed-by: Nicolai Hähnle
---
src/compiler/glsl/glsl_parser_extras.cpp | 1 +
src/compiler/glsl/glsl_parser_extras.h | 2 ++
src/compiler/glsl
total_invocations check outside of the loop (Ian)
Signed-off-by: Samuel Pitoiset
fix patch 2
---
src/mesa/main/api_validate.c | 111 +++
src/mesa/main/api_validate.h | 4 ++
src/mesa/main/compute.c | 17 ++
src/mesa/main/context.c
Compute shaders can now include a fixed local size as defined by
ARB_compute_shader or a variable size as defined by
ARB_compute_variable_group_size.
v2: - update formatting spec quotations (Ian)
- various cosmetic changes (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
The ARB_compute_variable_group_size specification explains that
when a compute shader includes both a fixed and a variable local
size, a compile-time error occurs.
v2: - update formatting spec quotations (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Nicolai Hähnle
---
src/compiler/glsl
v2: - correctly sort that new extension (Ian)
- fix up the comment (Ian)
Signed-off-by: Samuel Pitoiset
Reviewed-by: Ian Romanick
Reviewed-by: Nicolai Hähnle
---
.../glapi/gen/ARB_compute_variable_group_size.xml | 25 ++
src/mapi/glapi/gen/Makefile.am
if something needs to be slightly updated.
Please review,
Thanks!
Samuel Pitoiset (14):
glapi: add entry points for GL_ARB_compute_variable_group_size
mesa/main: add support for ARB_compute_variable_group_size
glsl: add enable flags for ARB_compute_variable_group_size
glsl: process
When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0
and a SIGFPE occurs when we compute the maximum number of regs.
This allows using 64 GPRs/thread.
v4: - use 512 threads on Fermi, 1024 on Kepler+
Signed-off-by: Samuel
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE
which represents the block size in threads.
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
Reviewed-by: Nicolai Hähnle
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
1 file changed, 2 insertions(+)
diff
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
---
docs/features.txt | 2 +-
docs/relnotes/12.1.0.html | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/docs/features.txt b/docs/features.txt
index 85ad1a1..12f0f25 100644
--- a/docs/features.txt
+++ b/docs
Only expose 512 threads/block on Fermi to not be limited by
32 GPRs/thread.
v4: - use 512 threads on Fermi, 1024 on Kepler+
Signed-off-by: Samuel Pitoiset
---
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/gallium
Signed-off-by: Samuel Pitoiset
Reviewed-by: Marek Olšák
Reviewed-by: Nicolai Hähnle
---
src/mesa/state_tracker/st_cb_compute.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/mesa/state_tracker/st_cb_compute.c
b/src/mesa/state_tracker/st_cb_compute.c