[Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-14 Thread Rhys Perry
m the previous pipeline's. Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_cmd_buffer.c | 46 +-- src/amd/vulkan/radv_pipeline.c | 217 --- src/amd/vulkan/radv_private.h| 2 + 3 files changed, 150 insertions(+), 115 deletions(-) diff --git a/src/

Re: [Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-14 Thread Rhys Perry
on, 14 Jan 2019 at 16:05, Samuel Pitoiset wrote: > > Did you benchmark? > > On 1/14/19 5:01 PM, Rhys Perry wrote: > > It's common in some applications to bind a new graphics pipeline without > > ending up changing any context registers. > > > > This has

Re: [Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-14 Thread Rhys Perry
This is with Rise of the Tomb Raider's graphics settings set to "High", by the way. On Mon, 14 Jan 2019 at 16:12, Rhys Perry wrote: > > I did and found small improvements in Rise of the Tomb Raider. I > measured framerates ~104.3% that of without the changes for the

Re: [Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-14 Thread Rhys Perry
ster changes a lot more. > > Not sure if that will improve anything though, but I think it's worth to > try? > > On 1/14/19 5:12 PM, Rhys Perry wrote: > > I did and found small improvements in Rise of the Tomb Raider. I > > measured framerates ~104.3% that of witho

[Mesa-dev] [PATCH] radv: prevent dirtying of dynamic state when it does not change

2019-01-15 Thread Rhys Perry
DXVK often sets dynamic state without actually changing it. Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_cmd_buffer.c | 92 ++-- 1 file changed, 76 insertions(+), 16 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
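
The idea in this patch description (skip the dirty flag when an application re-sets dynamic state to the value it already has) can be sketched roughly as follows. This is a minimal illustration, not radv's code: the `dyn_state` and `cmd_state` structures, the `DIRTY_*` flags, and both setter functions are hypothetical names invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical dirty bits; not radv's real flag names. */
#define DIRTY_LINE_WIDTH (1u << 0)
#define DIRTY_DEPTH_BIAS (1u << 1)

/* Hypothetical, heavily simplified dynamic state. */
struct dyn_state {
    float line_width;
    float depth_bias[3]; /* constant factor, clamp, slope factor */
};

struct cmd_state {
    struct dyn_state dynamic;
    uint32_t dirty; /* bits set here force a re-emit at the next draw */
};

static void set_line_width(struct cmd_state *s, float width)
{
    /* Nothing changed: leave the dirty bits alone so no registers are re-emitted. */
    if (s->dynamic.line_width == width)
        return;

    s->dynamic.line_width = width;
    s->dirty |= DIRTY_LINE_WIDTH;
}

static void set_depth_bias(struct cmd_state *s, const float bias[3])
{
    /* Compare the whole array at once before deciding to dirty the state. */
    if (memcmp(s->dynamic.depth_bias, bias, sizeof(s->dynamic.depth_bias)) == 0)
        return;

    memcpy(s->dynamic.depth_bias, bias, sizeof(s->dynamic.depth_bias));
    s->dirty |= DIRTY_DEPTH_BIAS;
}
```

The early return is the whole point: a redundant call costs one comparison on the CPU instead of a state re-emit later.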

Re: [Mesa-dev] [PATCH] radv: prevent dirtying of dynamic state when it does not change

2019-01-15 Thread Rhys Perry
I misread some code and forgot to remove it. It was always unrelated to this patch. On Wed, 16 Jan 2019 at 00:22, Bas Nieuwenhuizen wrote: > > On Tue, Jan 15, 2019 at 10:59 PM Rhys Perry wrote: > > > > DXVK often sets dynamic state without actually changing it. > >

Re: [Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-15 Thread Rhys Perry
do multiple runs of Rise of the Tomb Raider tomorrow and see if I get anything too different. On Wed, 16 Jan 2019 at 00:25, Bas Nieuwenhuizen wrote: > > On Mon, Jan 14, 2019 at 5:12 PM Rhys Perry wrote: > > > > I did and found small improvements in Rise of the Tomb

Re: [Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-16 Thread Rhys Perry
n Wed, 16 Jan 2019 at 00:39, Rhys Perry wrote: > > I did a before/after comparison during development with multiple runs > but only 1 before and after run to produce the numbers I sent. They > seemed to match up well enough to the runs during development, so I > wasn't too conc

Re: [Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-16 Thread Rhys Perry
1%) (1 extreme from "before" run excluded) Sorry for the noise. On Wed, 16 Jan 2019 at 11:46, Rhys Perry wrote: > > Rise of the Tomb Raider from without to with the change (average of 3 runs): > SpineOfTheMountain: 73.46667 fps -> 73.56667 fps (+0.14%) > Pro

[Mesa-dev] [PATCH v3 3/5] st/mesa: add support for EXT_shader_image_load_formatted

2019-01-16 Thread Rhys Perry
v3: rebase Signed-off-by: Rhys Perry Reviewed-by: Marek Olšák (v2) --- src/mesa/state_tracker/st_extensions.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 4628079260..b713eed969 100644 --- a/src/mesa

[Mesa-dev] [PATCH v3 1/5] gallium: add support for formatted image loads

2019-01-16 Thread Rhys Perry
v3: rebase v3: make use of u_pipe_screen_get_param_defaults Signed-off-by: Rhys Perry --- src/gallium/auxiliary/util/u_screen.c | 1 + src/gallium/docs/source/screen.rst | 1 + src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + src/gallium/drivers/nouveau/nv50

[Mesa-dev] [PATCH v3 0/5] nvc0: Implement EXT_shader_image_load_formatted

2019-01-16 Thread Rhys Perry
p v3: rebase v3: make use of u_pipe_screen_get_param_defaults v3: move RA code into its own function Rhys Perry (5): gallium: add support for formatted image loads mesa,glsl: add support for EXT_shader_image_load_formatted st/mesa: add support for EXT_shader_image_load_formatted nv

[Mesa-dev] [PATCH v3 5/5] nvc0, nv50/ir: enable support for formatted image loads on GM107+

2019-01-16 Thread Rhys Perry
v3: rebase Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c| 3 ++- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen

[Mesa-dev] [PATCH v3 2/5] mesa, glsl: add support for EXT_shader_image_load_formatted

2019-01-16 Thread Rhys Perry
v3: rebase Signed-off-by: Rhys Perry Reviewed-by: Marek Olšák (v2) --- src/compiler/glsl/ast_to_hir.cpp | 5 + src/compiler/glsl/glsl_parser_extras.cpp | 1 + src/compiler/glsl/glsl_parser_extras.h | 7 +++ src/mesa/main/extensions_table.h | 1 + src/mesa/main

[Mesa-dev] [PATCH v3 4/5] nv50/ir: use suld.p on GM107+

2019-01-16 Thread Rhys Perry
v3: rebase v3: move RA code into its own function Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 4 +++ .../nouveau/codegen/nv50_ir_emit_gm107.cpp| 34 --- .../drivers/nouveau/codegen/nv50_ir_print.cpp | 17 ++ .../drivers/no

[Mesa-dev] [PATCH 1/2] radv: pass radv_draw_info to radv_emit_draw_registers()

2019-01-19 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_cmd_buffer.c | 118 +++ 1 file changed, 58 insertions(+), 60 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index f41d6c0b3e7..f430b4f20dd 100644 --- a/src/amd/vulkan

[Mesa-dev] [PATCH] radv: avoid context rolls when binding graphics pipelines

2019-01-19 Thread Rhys Perry
m the previous pipeline's. v2: ensure late scissor emission is done when radv_emit_rbplus_state() is called v2: make use of cmd_buffer->state.workaround_scissor_bug Signed-off-by: Rhys Perry --- This second version depends on the patch "radv: add missed situations for scis

[Mesa-dev] [PATCH 2/2] radv: add missed situations for scissor bug workaround

2019-01-19 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_cmd_buffer.c | 65 src/amd/vulkan/radv_private.h| 2 + 2 files changed, 43 insertions(+), 24 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index f430b4f20dd

Re: [Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

2019-02-12 Thread Rhys Perry
t wrote: > > What's the status of this? > > On 12/7/18 6:21 PM, Rhys Perry wrote: > > This series adds support for: > > - VK_KHR_shader_float16_int8 > > - VK_AMD_gpu_shader_half_float > > - VK_AMD_gpu_shader_int16 > > - VK_KHR_8bit_storage >

Re: [Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

2019-02-13 Thread Rhys Perry
itting this series in four different parts? One for every > extension? Is this doable without too much trouble? > > On 2/12/19 6:02 PM, Rhys Perry wrote: > > It currently requires review (and possibly rebasing). Marek Olšák sent > > some feedback for a few of the patches but

[Mesa-dev] [PATCH v2 01/41] radv: bitcast 16-bit outputs to integers

2019-02-15 Thread Rhys Perry
16-bit outputs are stored as 16-bit floats in the outputs array, so they have to be bitcast. Fixes: b722b29f10d ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_nir_to_llvm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) di

[Mesa-dev] [PATCH v2 06/41] ac/nir: fix 16-bit ssbo stores

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 89a78b43c6f..b260142c177 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c

[Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

2019-02-15 Thread Rhys Perry
v2: fix C++ style comment Rhys Perry (41): radv: bitcast 16-bit outputs to integers radv: ensure export arguments are always float ac: add various helpers for float16/int16/int8 ac/nir: implement 8-bit push constant, ssbo and ubo loads ac/nir: implement 8-bit ssbo stores ac/nir: fix 1

[Mesa-dev] [PATCH v2 04/41] ac/nir: implement 8-bit push constant, ssbo and ubo loads

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 37 +++-- 1 file changed, 31 insertions(+), 6 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index bed52490bad..17d952d1ae8 100644 --- a/src/amd/common

[Mesa-dev] [PATCH v2 02/41] radv: ensure export arguments are always float

2019-02-15 Thread Rhys Perry
: add support for 16bit input/output') Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_nir_to_llvm.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c index a8268c44ecf..d3795eec403 100644 --- a/src/

[Mesa-dev] [PATCH v2 07/41] ac/nir: implement 8-bit nir_load_const_instr

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 4 1 file changed, 4 insertions(+) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index b260142c177..f39232b91a1 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common

[Mesa-dev] [PATCH v2 03/41] ac: add various helpers for float16/int16/int8

2019-02-15 Thread Rhys Perry
v2: remove ac_get_one(), ac_get_zero(), ac_get_onef() and ac_get_zerof() v2: remove ac_int_of_size() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 55 ++--- src/amd/common/ac_llvm_build.h | 15 +++-- src/amd/common/ac_nir_to_llvm.c | 30

[Mesa-dev] [PATCH v2 09/41] ac/nir: fix 64-bit nir_op_f2f16_rtz

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 691d444db05..741059b5f1a 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c

[Mesa-dev] [PATCH v2 05/41] ac/nir: implement 8-bit ssbo stores

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 17d952d1ae8..89a78b43c6f 100644 --- a/src/amd/common/ac_nir_to_llvm.c

[Mesa-dev] [PATCH v2 13/41] ac/nir: make ac_build_fsign work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zerof() and ac_get_onef() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 16 1 file changed, 4 insertions(+), 12 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 3b2257e8bf0..23e454385d7 1

[Mesa-dev] [PATCH v2 20/41] ac/nir: make emit_b2i work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_int_of_size() Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index e459001c1cf..75bb19031bf 100644 --- a/src/amd/c

[Mesa-dev] [PATCH v2 08/41] ac/nir: implement 8-bit conversions

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index f39232b91a1..691d444db05 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd

[Mesa-dev] [PATCH v2 29/41] ac/nir: make ac_build_bit_count work on all bit sizes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 33 +++-- 1 file changed, 7 insertions(+), 26 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index c986f800fa4..46738faea9d 100644 --- a/src/amd/common

[Mesa-dev] [PATCH v2 36/41] radv: handle all fragment output types

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_nir_to_llvm.c | 55 --- 1 file changed, 35 insertions(+), 20 deletions(-) diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c index 01b8b097ea1..c46eabf3656 100644 --- a/src/amd/vulkan

[Mesa-dev] [PATCH v2 22/41] compiler/nir: add lowering option for 16-bit ffma

2019-02-15 Thread Rhys Perry
The lowering needs to be disabled for sufficient precision to pass deqp-vk's 16-bit fma test on radv. Signed-off-by: Rhys Perry --- src/broadcom/compiler/nir_to_vir.c| 1 + src/compiler/nir/nir.h| 1 + src/compiler/nir/nir_opt_algebraic.py | 4 +++- src/gallium/dr

[Mesa-dev] [PATCH v2 27/41] ac/nir: make ac_build_umsb work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zero() and ac_int_of_size() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 38 +++--- 1 file changed, 7 insertions(+), 31 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 61085d

[Mesa-dev] [PATCH v2 24/41] ac/nir: implement 8 and 16 bit ac_build_readlane

2019-02-15 Thread Rhys Perry
v2: don't use ac_int_of_size() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 71eaac4b7bd..aa92c55c822 100644 --- a/src/amd/c

[Mesa-dev] [PATCH v2 30/41] ac/nir: make ac_build_bitfield_reverse work on all bit sizes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 26 ++ 1 file changed, 6 insertions(+), 20 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 46738faea9d..dff369aae7f 100644 --- a/src/amd/common/ac_llvm_build.c

[Mesa-dev] [PATCH v2 23/41] ac/nir: implement 16-bit ac_build_ddxy

2019-02-15 Thread Rhys Perry
v2: rebase Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 20 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index fb871a47400..71eaac4b7bd 100644 --- a/src/amd/common

[Mesa-dev] [PATCH v2 17/41] ac/nir: implement half-float nir_op_ldexp

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 8b0e07d2930..0e5946dfdb3 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common

[Mesa-dev] [PATCH v2 26/41] ac/nir: make ac_find_lsb work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zero() and ac_int_of_size() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 33 ++--- 1 file changed, 6 insertions(+), 27 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index aa92c5

[Mesa-dev] [PATCH v2 28/41] ac/nir: implement 8 and 16 bit ac_build_imsb

2019-02-15 Thread Rhys Perry
v2: fix C++ style comment Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 4 1 file changed, 4 insertions(+) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index ec87a7b9343..c986f800fa4 100644 --- a/src/amd/common/ac_llvm_build.c +++ b/src/amd

[Mesa-dev] [PATCH v2 10/41] ac/nir: make ac_build_clamp work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zerof() and ac_get_onef() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index b53d9c7ff8c..667f9700764 1

[Mesa-dev] [PATCH v2 14/41] ac/nir: make ac_build_fdiv support 16-bit floats

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_onef() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 23e454385d7..fb871a47400 100644 --- a/src/amd/common/ac_llvm_bu

[Mesa-dev] [PATCH v2 21/41] ac/nir: implement 16-bit shifts

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 75bb19031bf..bad1c2a990e 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd

[Mesa-dev] [PATCH v2 11/41] ac/nir: make ac_build_fract work on all bit sizes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 13 +++-- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 667f9700764..db937eb66fb 100644 --- a/src/amd/common/ac_llvm_build.c +++ b/src/amd

[Mesa-dev] [PATCH v2 16/41] ac/nir: implement half-float nir_op_frsq

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_onef() Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index cba0cec3e8f..8b0e07d2930 100644 --- a/src/amd/c

[Mesa-dev] [PATCH v2 15/41] ac/nir: implement half-float nir_op_frcp

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_onef() Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 741059b5f1a..cba0cec3e8f 100644 --- a/src/amd/c

[Mesa-dev] [PATCH v2 18/41] radv: lower 16-bit flrp

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_shader.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c index 1dcb0606246..adba730ad8b 100644 --- a/src/amd/vulkan/radv_shader.c +++ b/src/amd/vulkan/radv_shader.c @@ -53,6 +53,7

[Mesa-dev] [PATCH v2 12/41] ac/nir: make ac_build_isign work on all bit sizes

2019-02-15 Thread Rhys Perry
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size() Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 27 --- 1 file changed, 4 insertions(+), 23 deletions(-) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c

[Mesa-dev] [PATCH v2 19/41] ac/nir: support half floats in emit_b2f

2019-02-15 Thread Rhys Perry
This seems to generate fine code, even though the IR is a bit ugly. Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index

[Mesa-dev] [PATCH v2 40/41] ac/nir: have nir_op_f2f16 round to zero

2019-02-15 Thread Rhys Perry
In the hope that one day LLVM will then be able to generate code with vectorized v_cvt_pkrtz_f16_f32 instructions. Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common

[Mesa-dev] [PATCH v2 37/41] radv, ac: implement 16-bit interpolation

2019-02-15 Thread Rhys Perry
v2: add to patch series Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 33 +--- src/amd/common/ac_llvm_build.h | 3 ++- src/amd/common/ac_nir_to_llvm.c | 14 +++--- src/amd/vulkan/radv_nir_to_llvm.c| 27

[Mesa-dev] [PATCH v2 38/41] WIP: ac, radv: run LLVM's SLP vectorizer

2019-02-15 Thread Rhys Perry
v2: rebase v2: move LLVMAddSLPVectorizePass to after LLVMAddEarlyCSEMemSSAPass v2: run unconditionally on GFX9 and later v2: mark as WIP because it can make 32-bit code much worse Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_util.c | 8 ++-- 1 file changed, 6 insertions(+), 2

[Mesa-dev] [PATCH v2 39/41] ac/nir: generate better code for nir_op_f2f16_rtz

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 939b8eb13de..8bfc63958ca 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common

[Mesa-dev] [PATCH v2 41/41] radv, docs: expose float16, int16 and int8 features and extensions

2019-02-15 Thread Rhys Perry
v2: rebase v2: mark VK_KHR_8bit_storage as DONE in features.txt Signed-off-by: Rhys Perry --- docs/features.txt | 2 +- src/amd/vulkan/radv_device.c | 17 + src/amd/vulkan/radv_extensions.py | 4 src/amd/vulkan/radv_shader.c | 3 +++ 4 files

[Mesa-dev] [PATCH v2 37/41] WIP: radv, ac: implement 16-bit interpolation

2019-02-15 Thread Rhys Perry
v2: add to patch series Signed-off-by: Rhys Perry --- src/amd/common/ac_llvm_build.c | 33 +--- src/amd/common/ac_llvm_build.h | 3 ++- src/amd/common/ac_nir_to_llvm.c | 14 +++--- src/amd/vulkan/radv_nir_to_llvm.c| 27

[Mesa-dev] [PATCH v2 34/41] ac/nir: store all outputs as f32

2019-02-15 Thread Rhys Perry
v2: rebase v2: fix 64-bit visit_load_var() Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 14 ++ src/amd/vulkan/radv_nir_to_llvm.c | 22 +- 2 files changed, 19 insertions(+), 17 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src

Re: [Mesa-dev] [PATCH v2 37/41] radv, ac: implement 16-bit interpolation

2019-02-15 Thread Rhys Perry
This patch can be ignored. I forgot to delete it and it ended up getting sent. "[PATCH v2 37/41] WIP: radv, ac: implement 16-bit interpolation" is the correct one. On Sat, 16 Feb 2019 at 00:23, Rhys Perry wrote: > > v2: add to patch series > > Signed-off-by: Rhys Perry

[Mesa-dev] [PATCH v2 35/41] radv: store all fragment shader inputs as f32

2019-02-15 Thread Rhys Perry
v2: rebase Signed-off-by: Rhys Perry --- src/amd/vulkan/radv_nir_to_llvm.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/src/amd/vulkan/radv_nir_to_llvm.c b/src/amd/vulkan/radv_nir_to_llvm.c index 2002a744545..01b8b097ea1 100644 --- a/src/amd/vulkan

[Mesa-dev] [PATCH v2 33/41] ac/nir, radv: create an array of varying output types

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 68 +++ src/amd/common/ac_shader_abi.h| 1 + src/amd/vulkan/radv_nir_to_llvm.c | 3 ++ 3 files changed, 72 insertions(+) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common

[Mesa-dev] [PATCH v2 25/41] nir: make bitfield_reverse and ifind_msb work with all integers

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/compiler/nir/nir_opcodes.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py index dc4cd9ac63d..0f40bd6c548 100644 --- a/src/compiler/nir/nir_opcodes.py +++ b/src/compiler

[Mesa-dev] [PATCH v2 32/41] ac/nir: add 8-bit types to glsl_base_to_llvm_type

2019-02-15 Thread Rhys Perry
v2: remove 16-bit additions and rebase Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index f6ad1aa7e77..defbfdf4297 100644 --- a/src/amd/common

[Mesa-dev] [PATCH v2 31/41] ac/nir: implement 16-bit pack/unpack opcodes

2019-02-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/amd/common/ac_nir_to_llvm.c | 24 1 file changed, 24 insertions(+) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index bad1c2a990e..f6ad1aa7e77 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd

Re: [Mesa-dev] [PATCH v2 06/41] ac/nir: fix 16-bit ssbo stores

2019-02-18 Thread Rhys Perry
s this fix anything know? There is a 16-bit version of tbuffer.store, > maybe we should use it? > > On 2/16/19 1:21 AM, Rhys Perry wrote: > > Signed-off-by: Rhys Perry > > --- > > src/amd/common/ac_nir_to_llvm.c | 2 ++ > > 1 file changed, 2 insertions(+) >

Re: [Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

2019-02-18 Thread Rhys Perry
dle 64-bit varyings. So not all of them would work even if VK_FORMAT_R64_SFLOAT was an implemented vertex format. On Mon, 18 Feb 2019 at 08:53, Samuel Pitoiset wrote: > > > On 2/16/19 1:21 AM, Rhys Perry wrote: > > This series adds support for: > > - VK_KHR_shader_float16_int8 >

[Mesa-dev] [PATCH] nv50/ir, nvc0: add debug options for shader replacement

2018-05-29 Thread Rhys Perry
READ_PATH except using CRC-32 checksums instead of program IDs and chip-specific binaries instead of GLSL. Signed-off-by: Rhys Perry --- src/gallium/auxiliary/tgsi/tgsi_util.h | 1 + src/gallium/drivers/nouveau/Makefile.sources | 2 + src/gallium/drivers/nouveau

Re: [Mesa-dev] [PATCH v2 4/5] nvc0: add support for programmable sample locations

2018-05-29 Thread Rhys Perry
less. On Mon, May 28, 2018 at 9:05 PM, Ilia Mirkin wrote: > ARB_sample_locations has all this stuff about a resolve of some sort > when you switch around the locations. I don't see anything here about > that. Thoughts? > > Also some more specific comments inline: >

[Mesa-dev] [PATCH v5] nv50/ir, nvc0: add debug options for shader replacement

2018-05-30 Thread Rhys Perry
eaders. This is all much like MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH except using CRC-32 checksums instead of program IDs and chip-specific binaries instead of GLSL. Signed-off-by: Rhys Perry --- src/gallium/auxiliary/tgsi/tgsi_util.h | 1 + src/gallium/drivers/nouveau/Makefi

[Mesa-dev] [PATCH v3 0/5] Implement ARB_sample_locations for nvc0

2018-06-01 Thread Rhys Perry
o the feature is available on ES - decouple framebuffer and sample location state in the state tracker and nvc0 - rebase to upstream master Rhys Perry (5): mesa: add support for ARB_sample_locations gallium: add support for programmable sample locations st/mesa: add support for ARB_sample_loca

[Mesa-dev] [PATCH v3 2/5] gallium: add support for programmable sample locations

2018-06-01 Thread Rhys Perry
Signed-off-by: Rhys Perry Reviewed-by: Brian Paul (v2) Reviewed-by: Marek Olšák (v2) --- src/gallium/auxiliary/util/u_framebuffer.c | 30 + src/gallium/auxiliary/util/u_framebuffer.h | 5 +++ src/gallium/docs/source/context.rst | 14 src

[Mesa-dev] [PATCH v3 1/5] mesa: add support for ARB_sample_locations

2018-06-01 Thread Rhys Perry
Signed-off-by: Rhys Perry Reviewed-by: Brian Paul (v2) Reviewed-by: Marek Olšák (v2) --- src/mapi/glapi/gen/gl_API.xml | 104 + src/mesa/main/config.h | 9 ++ src/mesa/main/dd.h | 8 + src/mesa/main/extensions_table.h| 2

[Mesa-dev] [PATCH v3 3/5] st/mesa: add support for ARB_sample_locations

2018-06-01 Thread Rhys Perry
Signed-off-by: Rhys Perry Reviewed-by: Brian Paul (v2) Reviewed-by: Marek Olšák (v2) --- src/mesa/state_tracker/st_atom.h | 2 +- src/mesa/state_tracker/st_atom_list.h | 2 +- src/mesa/state_tracker/st_atom_msaa.c | 77 +- src/mesa/state_tracker

[Mesa-dev] [PATCH v3 4/5] nvc0: add support for programmable sample locations

2018-06-01 Thread Rhys Perry
Signed-off-by: Rhys Perry --- .../drivers/nouveau/codegen/nv50_ir_driver.h | 2 + .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 7 + .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 102 -- .../nouveau/codegen/nv50_ir_lowering_nvc0.h| 2 + src/gallium

[Mesa-dev] [PATCH v3 5/5] docs: document addition of GL_ARB_sample_locations for nvc0

2018-06-01 Thread Rhys Perry
Signed-off-by: Rhys Perry Reviewed-by: Brian Paul (v2) --- docs/features.txt | 2 +- docs/relnotes/18.2.0.html | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/features.txt b/docs/features.txt index e786bbecf4..2eac14fb32 100644 --- a/docs/features.txt +++ b

[Mesa-dev] [PATCH] nv50/ir: fix image stores with indirect handles

2018-06-05 Thread Rhys Perry
Having this if statement here prevented the next if statement from being reached in the case of image stores, which is needed for instructions with indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...". Signed-off-by: Rhys Perry --- src/gallium/drivers/nouve

[Mesa-dev] [PATCH 0/6] Fix Various Compilation Issues With Bindless

2018-06-06 Thread Rhys Perry
MP2HND and IMG2HND - IMG2HND with Kepler is not implemented Usage of bound handles as l-values and casting them is handled better than before though. Some tests for these changes have been posted on the piglit mailing list. Rhys Perry (6): gallium: add new SAMP2HND and IMG2HND opcodes nv50/ir: a

[Mesa-dev] [PATCH 6/6] glsl: fix function inlining with opaque parameters

2018-06-06 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/compiler/glsl/opt_function_inlining.cpp | 52 - 1 file changed, 44 insertions(+), 8 deletions(-) diff --git a/src/compiler/glsl/opt_function_inlining.cpp b/src/compiler/glsl/opt_function_inlining.cpp index 04690b6cf4..52f57da936

[Mesa-dev] [PATCH 4/6] glsl: allow ?: operator with images and samplers when bindless is enabled

2018-06-06 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/compiler/glsl/ast_to_hir.cpp | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp index 3bf581571e..8a7dd62506 100644 --- a/src/compiler/glsl/ast_to_hir.cpp +++ b/src

[Mesa-dev] [PATCH 2/6] nv50/ir: add support for SAMP2HND on gk104+ and IMG2HND on gm107+

2018-06-06 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 2 ++ src/gallium/drivers/nouveau/codegen/nv50_ir.h | 2 ++ .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 22 +++ .../drivers/nouveau/codegen/nv50_ir_inlines.h | 4

[Mesa-dev] [PATCH 5/6] glsl, glsl_to_tgsi: fix sampler/image constants

2018-06-06 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/compiler/glsl/ir.cpp | 32 -- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 14 ++--- 2 files changed, 41 insertions(+), 5 deletions(-) diff --git a/src/compiler/glsl/ir.cpp b/src/compiler/glsl/ir.cpp index

[Mesa-dev] [PATCH 3/6] glsl_to_tgsi: allow bound samplers and images to be used as l-values

2018-06-06 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 55 +++- src/mesa/state_tracker/st_glsl_to_tgsi_private.h | 1 + 2 files changed, 55 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa

[Mesa-dev] [PATCH 1/6] gallium: add new SAMP2HND and IMG2HND opcodes

2018-06-06 Thread Rhys Perry
This commit does not add support for the opcodes in gallivm or tgsi_to_nir.c Signed-off-by: Rhys Perry --- src/gallium/auxiliary/tgsi/tgsi_info.c | 2 ++ src/gallium/auxiliary/tgsi/tgsi_info_opcodes.h | 4 ++-- src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h | 3 +++ src/gallium/docs

Re: [Mesa-dev] [PATCH 0/6] Fix Various Compilation Issues With Bindless

2018-06-06 Thread Rhys Perry
Oops, I meant r-values, not l-values. Seems the meaning of the word in my head changed at some point. On Wed, Jun 6, 2018 at 8:55 PM, Rhys Perry wrote: > Previously, there were some errors in the compiler's implementation of > ARB_bindless_texture, mostly related to usage of bou

Re: [Mesa-dev] [PATCH 00/16] Move the Mesa Website to Sphinx

2018-06-08 Thread Rhys Perry
Might be good to do something like this: https://codepen.io/anon/pen/ERNdYJ So that those with NoScript or something won't have gears constantly rotating on their screen. On Fri, Jun 8, 2018 at 2:25 PM, Erik Faye-Lund wrote: > On Fri, Jun 8, 2018 at 2:06 PM, Rob Clark wrote: >> On Fri, Jun 8, 20

[Mesa-dev] [PATCH] nv50/ir: Improve performance of signed division by powers of two

2018-06-08 Thread Rhys Perry
Signed-off-by: Rhys Perry --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 29 +++--- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index

[Mesa-dev] [PATCH v2] nv50/ir: improve performance of signed division by powers of two

2018-06-09 Thread Rhys Perry
Changes in v2: - Stylistic changes - Use OP_SLCT instead of OP_SELP which only worked by luck - Fix issues in edge cases Signed-off-by: Rhys Perry --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 30 +++--- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a
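
For context, the usual way to replace a truncating signed division by a power of two with shifts can be sketched as below. This is only an illustration of the general technique: the diff is cut off in this listing, so whether the patch emits exactly this sequence is not confirmed, and the function name `sdiv_pow2` is hypothetical. The sign-dependent choice of bias is the kind of step a select-style instruction would cover.

```python
def sdiv_pow2(x: int, shift: int) -> int:
    """Divide x by 2**shift, rounding toward zero (C-style signed division).

    An arithmetic shift right on its own rounds toward negative infinity,
    so a negative dividend is biased by (2**shift - 1) before shifting.
    """
    # Select the bias from the sign of the dividend.
    bias = (1 << shift) - 1 if x < 0 else 0
    # Python's >> on ints is an arithmetic (flooring) shift.
    return (x + bias) >> shift
```

Without the bias, `-7 >> 1` floors to -4, whereas a truncating division of -7 by 2 is expected to give -3.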

[Mesa-dev] [PATCH] nv50/ir: fix TargetNVC0::insnCanLoadOffset()

2018-06-11 Thread Rhys Perry
Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset could be set to a specific value. The IndirectPropagation pass expected it to return whether the offset could be increased. Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 1 + 1
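
The difference between the two interpretations described above can be sketched as follows. This is a hedged illustration, not the driver's code: the offset range and the names `can_set_offset` and `can_increase_offset` are hypothetical, since the snippet does not state the real hardware limits.

```python
# Hypothetical encodable offset range, for illustration only.
OFFSET_MIN = -(1 << 15)
OFFSET_MAX = (1 << 15) - 1


def can_set_offset(value: int) -> bool:
    """Old meaning: can the offset field hold this absolute value?"""
    return OFFSET_MIN <= value <= OFFSET_MAX


def can_increase_offset(current: int, delta: int) -> bool:
    """Meaning the caller expects: can the existing offset grow by delta?"""
    return can_set_offset(current + delta)
```

A caller that passes a delta to the first check can get a wrong answer: with the offset already at `OFFSET_MAX`, a delta of 100 is accepted by `can_set_offset(100)` but rejected by `can_increase_offset(OFFSET_MAX, 100)`.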

[Mesa-dev] [PATCH 0/6] Fix Various Compilation Issues With Bindless

2018-06-11 Thread Rhys Perry
Ping to those who seem appropriate for this patch in case it was forgotten or missed. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nv50/ir: handle SHLADD in IndirectPropagation

2018-06-11 Thread Rhys Perry
hurt 0 0 0 0 0 Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH v2 0/2] nv50/ir: SHLADD related improvements

2018-06-12 Thread Rhys Perry
helped 0 0 71 2016 2016 hurt 0 0 52 19 19 Rhys Perry (2): nv50/ir: handle SHLADD in IndirectPropagation nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding src/gallium/d

[Mesa-dev] [PATCH v2 2/2] nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding

2018-06-12 Thread Rhys Perry
32 32 Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index

[Mesa-dev] [PATCH v2 1/2] nv50/ir: handle SHLADD in IndirectPropagation

2018-06-12 Thread Rhys Perry
0 Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 39177bd044..83fb1

[Mesa-dev] [PATCH 1/4] nv50/ir: add preliminary support for OP_XMAD

2018-06-13 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 3 ++- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 14 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 +-- .../drivers/nouveau/codegen/nv50_ir_print.cpp | 20

[Mesa-dev] [PATCH 4/4] nv50/ir: further optimize multiplication by immediates

2018-06-13 Thread Rhys Perry
74 23 23 Signed-off-by: Rhys Perry --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 123 ++--- src/util/bitscan.h | 26 + 2 files changed, 135 insertions(+), 14 deletions(-) diff --git a/src/gallium/dri

[Mesa-dev] [PATCH 3/4] nv50/ir: optimize imul/imad to xmads

2018-06-13 Thread Rhys Perry
t bytes helped 0 0 39 0 0 hurt 1 0 334 2277 2277 Signed-off-by: Rhys Perry --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 53 ++ 1 file changed, 53 insertions(+) diff
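
As background for the subject line, a 32-bit integer multiply can be built from 16-bit multiply-adds roughly as sketched below. This assumes a multiply-add that takes two 16-bit operands and a 32-bit addend, which is an assumption about XMAD made for illustration; the helper names are hypothetical and the exact instruction sequence in the patch is not visible in this truncated snippet.

```python
MASK32 = 0xFFFFFFFF


def mul16_add(a16: int, b16: int, c: int) -> int:
    """Hypothetical 16x16 + 32 multiply-add, wrapped to 32 bits."""
    return (a16 * b16 + c) & MASK32


def imul32(a: int, b: int) -> int:
    """Low 32 bits of a * b using three 16-bit multiply-adds."""
    a_lo, a_hi = a & 0xFFFF, (a >> 16) & 0xFFFF
    b_lo, b_hi = b & 0xFFFF, (b >> 16) & 0xFFFF
    # Sum of the two cross products; a_hi * b_hi only affects bits >= 32.
    cross = mul16_add(a_lo, b_hi, 0)
    cross = mul16_add(a_hi, b_lo, cross)
    # Low product plus the cross terms shifted into the upper half.
    return mul16_add(a_lo, b_lo, (cross << 16) & MASK32)
```

Three multiply-adds replace one full-width multiply, which may explain why the statistics above count more instructions even where the change is intended as a speedup.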

[Mesa-dev] [PATCH 0/4] nv50/ir: Improve Performance of Integer Multiplication

2018-06-13 Thread Rhys Perry
in shared programs : 360704 -> 360704 (0.00%) total local used in shared programs : 20952 -> 20952 (0.00%) local shared gpr inst bytes helped 0 0 255 680 680 hurt 0 0 128

[Mesa-dev] [PATCH 2/4] gm107/ir: add support for OP_XMAD on GM107+

2018-06-13 Thread Rhys Perry
Signed-off-by: Rhys Perry --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 61 ++ .../nouveau/codegen/nv50_ir_target_gm107.cpp | 6 ++- .../nouveau/codegen/nv50_ir_target_nvc0.cpp| 1 + 3 files changed, 67 insertions(+), 1 deletion(-) diff --git a/src

[Mesa-dev] [PATCH 0/4] nv50/ir: Improve Performance of Integer Multiplication

2018-06-13 Thread Rhys Perry
Forgot to CC you.

[Mesa-dev] [PATCH 2/5] mesa, glsl: add support for EXT_shader_image_load_formatted

2018-06-15 Thread Rhys Perry
Signed-off-by: Rhys Perry --- src/compiler/glsl/ast_to_hir.cpp | 5 + src/compiler/glsl/glsl_parser_extras.cpp | 1 + src/compiler/glsl/glsl_parser_extras.h | 7 +++ src/mesa/main/extensions_table.h | 1 + src/mesa/main/mtypes.h | 1 + 5 files changed
