Re: [Mesa-dev] [PATCH 09/10] radeonsi: don't emit AMDGPU intrinsics for RSQ opcodes

2015-10-11 Thread Matt Arsenault
> On Oct 10, 2015, at 6:29 PM, Marek Olšák wrote: > > +/* This requires "unsafe-fp-math" for LLVM to convert it to RSQ. */ > +static void emit_rsq(const struct lp_build_tgsi_action *action, > + struct lp_build_tgsi_context *bld_base, > + struct lp_build_emit_dat

Re: [Mesa-dev] [PATCH 07/10] radeonsi: don't use the AMDGPU intrinsic for CMP

2015-10-11 Thread Matt Arsenault
> On Oct 10, 2015, at 6:29 PM, Marek Olšák wrote: > > The increase in VGPRs is unfortunate, but the decrease in the scratch size > is always welcome. Do you have a specific example where this happens you can post?___ mesa-dev mailing list mesa-dev@lis

Re: [Mesa-dev] [PATCH] Revert "radeon/llvm: enable unsafe math for graphics shaders"

2015-02-18 Thread Matt Arsenault
> On Feb 17, 2015, at 11:52 PM, Grigori Goronzy wrote: > > Hi, > > AFAIR not enabling this makes LLVM generate really slow code in some > common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe > FP math optimization or some optimization is too eager. Other drivers do > fine wit

Re: [Mesa-dev] [PATCH] Revert "radeon/llvm: enable unsafe math for graphics shaders"

2015-02-18 Thread Matt Arsenault
> On Feb 18, 2015, at 1:15 AM, Michel Dänzer wrote: > > On 18.02.2015 17:13, Michel Dänzer wrote: >> On 18.02.2015 16:52, Grigori Goronzy wrote: >>> >>> What's the impact on performance with unsafe FP math disabled at this time? >> >> I don't know. Correctness trumps performance. > > FWIW, I

Re: [Mesa-dev] [PATCH 2/3] clover: Enable cl_khr_fp64 for devices that support doubles v2

2015-02-26 Thread Matt Arsenault
> On Feb 26, 2015, at 5:06 PM, Tom Stellard wrote: > > v2: > - Report correct values for CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE and >CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE. > - Only define cl_khr_fp64 if the extension is supported. > - Remove trailing space from extension string. > - Rena

Re: [Mesa-dev] [PATCH 2/9] radeonsi: use V_BFE for extracting a sample index

2015-03-02 Thread Matt Arsenault
> On Mar 2, 2015, at 1:19 PM, Tom Stellard wrote: > > On Mon, Mar 02, 2015 at 10:14:00PM +0100, Marek Olšák wrote: >> On Mon, Mar 2, 2015 at 10:05 PM, Tom Stellard wrote: >>> On Mon, Mar 02, 2015 at 12:54:16PM +0100, Marek Olšák wrote: From: Marek Olšák --- src/gallium/dri
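For readers unfamiliar with the operation named in the subject line, an unsigned bitfield extract can be sketched in C as below. This is only an illustration of the general technique; `bfe_u32` is a hypothetical helper, not code from the patch, and the position of the sample index in any real register is not stated in the snippet above.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned bitfield extract, i.e. the kind of
 * operation a BFE-style instruction performs in one step.
 * Assumes 1 <= width <= 31 and offset + width <= 32. */
uint32_t bfe_u32(uint32_t value, unsigned offset, unsigned width)
{
    uint32_t mask = (1u << width) - 1u; /* low `width` bits set */
    return (value >> offset) & mask;    /* shift the field down, then mask */
}
```

For example, `bfe_u32(0xABCD1234u, 8, 4)` picks out the 4-bit field that starts at bit 8.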

Re: [Mesa-dev] [PATCH 2/9] radeonsi: use V_BFE for extracting a sample index

2015-03-05 Thread Matt Arsenault
> On Mar 5, 2015, at 6:50 AM, Tom Stellard wrote: > > On Mon, Mar 02, 2015 at 02:09:29PM -0800, Matt Arsenault wrote: >> >>> On Mar 2, 2015, at 1:19 PM, Tom Stellard wrote: >>> >>> On Mon, Mar 02, 2015 at 10:14:00PM +0100, Marek Olšák wrote: >>

Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG

2015-03-05 Thread Matt Arsenault
> On Mar 5, 2015, at 10:42 AM, Francisco Jerez wrote: > > Could you add that this is according to the OpenCL 1.1 specification? > OpenCL 1.2 is even weaker (CL_FP_INF_NAN is not required, only one of > CL_FP_ROUND_TO_ZERO or CL_FP_ROUND_TO_NEAREST is required, and no FP > capabilities at all are
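As a rough illustration of the OpenCL 1.1 minimum discussed in this thread, the sketch below checks a `CL_DEVICE_SINGLE_FP_CONFIG` bitfield against `CL_FP_ROUND_TO_NEAREST | CL_FP_INF_NAN`. The bit values mirror the Khronos `CL/cl.h` header; the two function names are hypothetical and are not taken from clover.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in the Khronos CL/cl.h header. */
#define CL_FP_DENORM           (1u << 0)
#define CL_FP_INF_NAN          (1u << 1)
#define CL_FP_ROUND_TO_NEAREST (1u << 2)

/* Hypothetical helper: the minimum single-precision capability that
 * OpenCL 1.1 mandates. CL_FP_DENORM is deliberately not part of it. */
uint64_t min_single_fp_config_cl11(void)
{
    return CL_FP_ROUND_TO_NEAREST | CL_FP_INF_NAN;
}

/* Hypothetical helper: true if `config` contains every mandated bit. */
bool meets_cl11_minimum(uint64_t config)
{
    uint64_t required = min_single_fp_config_cl11();
    return (config & required) == required;
}
```

A device that reports only the minimum passes this check; one that reports `CL_FP_DENORM` alone does not.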

Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG

2015-03-06 Thread Matt Arsenault
> On Mar 6, 2015, at 8:56 AM, Francisco Jerez wrote: > > Tom Stellard writes: > >> On Thu, Mar 05, 2015 at 08:42:25PM +0200, Francisco Jerez wrote: >>> Tom Stellard writes: >>> This means dropping CL_FP_DENORM from the current return value. --- src/ga

Re: [Mesa-dev] GLSL IR & TGSI on-disk shader cache

2017-02-13 Thread Matt Arsenault
> On Feb 6, 2017, at 19:42, Timothy Arceri wrote: > > This series does not include the patch that adds cache support > to the radeonsi backend, the main reason for this is that llvm > currently doesn't allow the version to be queried at runtime > (as far as I'm aware) although it seems like othe

Re: [Mesa-dev] [PATCH] radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI

2017-02-14 Thread Matt Arsenault
> On Feb 13, 2017, at 09:01, Marek Olšák wrote: > > So that we can disable u_vbuf for GL core profiles. > > This is a v2 of the previous VI-only patch. > It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI. Is this enabled? I wasn’t sure, so currently LLVM assumes no. You can start

Re: [Mesa-dev] [PATCH 1/1] clover: Dump linked module to a different file

2017-02-22 Thread Matt Arsenault
> On Feb 22, 2017, at 07:51, Jan Vesely wrote: > > This allows passing the generated files directly to llc or bugpoint. > Note that if a program links multiple binaries they will still be in the same > file; the module name is "link". Can you add a counter ID or something to ensure unique files?

Re: [Mesa-dev] [PATCH] radv/ac: enable loop unrolling.

2017-02-23 Thread Matt Arsenault
> On Feb 23, 2017, at 19:27, Dave Airlie wrote: > > +static void set_unroll_metadata(struct nir_to_llvm_context *ctx, > +LLVMValueRef br) > +{ > + unsigned kind = LLVMGetMDKindIDInContext(ctx->context, "llvm.loop", 9); > + LLVMValueRef md_unroll; > + LLVMV

Re: [Mesa-dev] [PATCH] radv/ac: enable loop unrolling.

2017-02-23 Thread Matt Arsenault
> On Feb 23, 2017, at 19:44, Dave Airlie wrote: > > On 24 February 2017 at 13:36, Matt Arsenault wrote: >> >> On Feb 23, 2017, at 19:27, Dave Airlie wrote: >> >> +static void set_unroll_metadata(struct nir_to_llvm

Re: [Mesa-dev] [PATCH] radv/ac: enable loop unrolling.

2017-02-24 Thread Matt Arsenault
> On Feb 24, 2017, at 01:45, Marek Olšák wrote: > > The main requirement is that if there is indirect indexing inside a > loop, we always want to unroll the whole loop to get rid of the > indexing, which can decrease scratch usage. > > Marek We boost the unroll thresholds when there is private

Re: [Mesa-dev] [PATCH] radv/ac: enable loop unrolling.

2017-02-24 Thread Matt Arsenault
> On Feb 24, 2017, at 14:39, Marek Olšák wrote: > > On Fri, Feb 24, 2017 at 7:20 PM, Matt Arsenault wrote: >> >> On Feb 24, 2017, at 01:45, Marek Olšák wrote: >> >> The main requirement is that if there is indirect indexing inside a >> loop, we alwa

Re: [Mesa-dev] [PATCH 10/24] radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz

2017-02-25 Thread Matt Arsenault
> On Feb 25, 2017, at 15:58, Marek Olšák wrote: > > } > + > +LLVMValueRef ac_emit_cvt_pkrtz_f16(struct ac_llvm_context *ctx, > +LLVMValueRef args[2]) > +{ > + if (HAVE_LLVM >= 0x0500) { > + LLVMTypeRef v2f16 = > + LLVMVectorType

Re: [Mesa-dev] [PATCH 2/5] radeonsi: set dereferenceable attribute on descriptor arrays

2016-07-13 Thread Matt Arsenault
> On Jul 13, 2016, at 12:36, Marek Olšák wrote: > > On Wed, Jul 13, 2016 at 9:25 PM, Tom Stellard > wrote: >> On Wed, Jul 13, 2016 at 03:20:55PM -0400, Tom Stellard wrote: >>> On Tue, Jul 12, 2016 at 10:52:35PM +0200, Marek Olšák wrote: From: Marek Olšák >>

Re: [Mesa-dev] Mesa (master): Revert "radeon/llvm: Use alloca instructions for larger arrays"

2016-07-21 Thread Matt Arsenault
> On Jul 21, 2016, at 01:03, Michel Dänzer wrote: > > On 21.07.2016 00:04, Michel Dänzer wrote: >> On 15.07.2016 05:15, Marek Olšák wrote: >>> Module: Mesa >>> Branch: master >>> Commit: f84e9d749fbb6da73a60fb70e6725db773c9b8f8 >>> URL: >>> http://cgit.freedesktop.org/me

Re: [Mesa-dev] Mesa (master): Revert "radeon/llvm: Use alloca instructions for larger arrays"

2016-07-26 Thread Matt Arsenault
> On Jul 26, 2016, at 14:37, Marek Olšák wrote: > > On Sat, Jul 23, 2016 at 4:07 PM, Nicolai Hähnle wrote: >> On 22.07.2016 12:08, Michel Dänzer wrote: >>> >>> On 21.07.2016 18:17, Matt Arsenault wrote: >>>

Re: [Mesa-dev] PATCH: R600 + SI Private memory fixes; Use more SALU instructions on SI

2013-10-10 Thread Matt Arsenault
On 10/10/2013 10:55 AM, Tom Stellard wrote: Hi, The attached patches simplify the handling of OpenCL private memory space for VLIW4/VLIW5 GPUs and should fix a crash with pyrit on r600g. Also included in the series is private memory support on SI as well as an optimization to prefer selecting SA

Re: [Mesa-dev] [PATCH] R600/SI: add Gather4 intrinsics

2014-06-08 Thread Matt Arsenault
On 06/06/2014 02:57 PM, Marek Olšák wrote: DMASK was repurposed for GATHER4, so all passes which modify DMASK are disabled by setting MIMG=0 and hasPostISelHook=0. See my Mesa patches for how DMASK works with GATHER4, because this is not documented anywhere. Can you add a comment explaining thi

Re: [Mesa-dev] [PATCH 1/2] R600/SI: add Gather4 intrinsics (v2)

2014-06-16 Thread Matt Arsenault
On 06/16/2014 08:45 AM, Tom Stellard wrote: You don't need to add new SDNodes for all these instructions, you can just use the intrinsic directly in the pattern. The only reason to add SDNodes, is if there are optimizations / special lowering we can do for these instructions. I kind of like hav

Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles

2014-06-17 Thread Matt Arsenault
On Jun 17, 2014, at 3:11 PM, Bruno Jimenez wrote: > Hi, > > I have a couple of questions about this patch: > > 1) Could you please also change how the results of the > 'CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE' and > 'CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE' queries are generated? > According to t

Re: [Mesa-dev] [PATCH] radeon/llvm: Adapt to AMDGPU.rsq intrinsic change in LLVM 3.5

2014-06-19 Thread Matt Arsenault
On Jun 18, 2014, at 11:53 PM, Michel Dänzer wrote: > From: Michel Dänzer > > Signed-off-by: Michel Dänzer > --- > src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c > b/src/galli

Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles

2014-06-21 Thread Matt Arsenault
On Jun 21, 2014, at 9:37 AM, Francisco Jerez wrote: > Tom Stellard writes: > >> --- >> src/gallium/state_trackers/clover/api/device.cpp | 4 +++- >> src/gallium/state_trackers/clover/core/device.cpp | 6 ++ >> src/gallium/state_trackers/clover/core/device.hpp | 1 + >> 3 files changed, 10 in

Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles

2014-06-21 Thread Matt Arsenault
On Jun 21, 2014, at 12:32 PM, Francisco Jerez wrote: > Matt Arsenault writes: > >> On Jun 21, 2014, at 9:37 AM, Francisco Jerez wrote: >> >>> Tom Stellard writes: >>>> [...] >>>> case CL_DEVICE_EXTENSIONS: >>>> - buf.as_

Re: [Mesa-dev] [PATCH 1/2] clover: Report a default value for CL_DEVICE_SINGLE_FP_CONFIG

2014-07-02 Thread Matt Arsenault
On Jul 2, 2014, at 12:48 PM, Tom Stellard wrote: > --- > src/gallium/state_trackers/clover/api/device.cpp | 3 +-- > src/gallium/state_trackers/clover/core/device.cpp | 6 ++ > src/gallium/state_trackers/clover/core/device.hpp | 1 + > 3 files changed, 8 insertions(+), 2 deletions(-) > > diff

Re: [Mesa-dev] [PATCH 1/2] clover: Report a default value for CL_DEVICE_SINGLE_FP_CONFIG

2014-07-02 Thread Matt Arsenault
On Jul 2, 2014, at 12:52 PM, Matt Arsenault wrote: > > On Jul 2, 2014, at 12:48 PM, Tom Stellard wrote: > >> --- >> src/gallium/state_trackers/clover/api/device.cpp | 3 +-- >> src/gallium/state_trackers/clover/core/device.cpp | 6 ++ >> src/gallium/stat

Re: [Mesa-dev] [PATCH] R600/SI: Use i32 vectors for resources and samplers

2014-07-07 Thread Matt Arsenault
On Jul 7, 2014, at 8:28 AM, Marek Olšák wrote: > From: Marek Olšák > > This affects new intrinsics only. > > What surprises me is that v32i8 still works. > --- > lib/Target/R600/SIInstructions.td | 4 +- > lib/Target/R600/SIIntrinsics.td | 6 +-- > test/CodeGen/R600/llvm

Re: [Mesa-dev] [PATCH 1/1] r600: Use llvm intrinsic to read work dimension information

2014-07-30 Thread Matt Arsenault
On Jul 30, 2014, at 4:11 PM, Jan Vesely wrote: > +define i32 @get_work_dim() nounwind readnone alwaysinline { > + %x = call i32 @llvm.r600.read.workdim() nounwind readnone > + ret i32 %x > +} > -- Maybe this should have range metadata attached now that it applies to calls?

Re: [Mesa-dev] [PATCH 1/1] R600: Add new intrinsic to read work dimensions

2014-07-30 Thread Matt Arsenault
On 07/30/2014 04:11 PM, Jan Vesely wrote: CC: Tom Stellard CC: Matt Arsenault Signed-off-by: Jan Vesely --- include/llvm/IR/IntrinsicsR600.td| 2 ++ lib/Target/R600/R600ISelLowering.cpp | 6 -- 2 files changed, 6 insertions(+), 2 deletions(-) Needs a test for the intrinsic

Re: [Mesa-dev] [PATCH 2/2] r600g: Pass dimension parameter to compute shader.

2014-07-31 Thread Matt Arsenault
On 07/31/2014 03:58 PM, Jan Vesely wrote: Would that work with things like one kernel calling another kernel? If we had a function called from two kernels how would it know where to look? I don't think this case can be handled as 2 separate kernels with the same calling convention. If a kernel i

Re: [Mesa-dev] [PATCH 2/2] R600/SI: FMA is faster than fmul and fadd for f64

2013-08-09 Thread Matt Arsenault
On 08/09/2013 05:59 AM, Niels Ole Salscheider wrote: +bool SITargetLowering::isFMAFasterThanFMulAndFAdd(EVT VT) const { + VT = VT.getScalarType(); + + if (!VT.isSimple()) +return false; + + switch (VT.getSimpleVT().SimpleTy) { + case MVT::f32: +return false; /* There is V_MAD_F32 for

Re: [Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+

2014-04-22 Thread Matt Arsenault
On 04/22/2014 02:35 PM, Tom Stellard wrote: On Mon, Apr 21, 2014 at 10:02:27PM -0400, Jan Vesely wrote: Hi, I ran into a problem caused by this part of the OCL specs (6.1.5 Alignment of Types): "For 3-component vector data types, the size of the data type is 4 * sizeof(component)." and the cor
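The rule quoted from section 6.1.5 can be written out as a small C sketch. `cl_vector_size` is a hypothetical helper introduced only for illustration; it covers storage size and says nothing about how a compiler should lower 3-element vectors.

```c
#include <stddef.h>

/* Hypothetical helper: storage size of an OpenCL n-component vector.
 * Per the quoted rule, a 3-component vector occupies the same space
 * as a 4-component one: 4 * sizeof(component). */
size_t cl_vector_size(unsigned num_components, size_t component_size)
{
    unsigned slots = (num_components == 3) ? 4 : num_components;
    return slots * component_size;
}
```

With 4-byte components, a 3-component vector therefore takes 16 bytes rather than 12.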

Re: [Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+

2014-04-22 Thread Matt Arsenault
On 04/22/2014 05:22 PM, Jan Vesely wrote: On Tue, 2014-04-22 at 14:40 -0700, Matt Arsenault wrote: On 04/22/2014 02:35 PM, Tom Stellard wrote: On Mon, Apr 21, 2014 at 10:02:27PM -0400, Jan Vesely wrote: Hi, I ran into a problem caused by this part of the OCL specs (6.1.5 Alignment of Types

Re: [Mesa-dev] R600 Patches: Add support for the local address space

2013-06-12 Thread Matt Arsenault
On 06/12/2013 05:42 PM, Tom Stellard wrote: Hi, The attached patches add support for local address space on Evergreen / Northern Islands GPUs. Please Review. -Tom > + def int_AMDGPU_barrier_local : Intrinsic<[], [], []>; You probably want to mark this as IntrReadMem to try to avoid reorderi

Re: [Mesa-dev] [PATCH] R600/SI: Custom select 64-bit ADD

2014-02-08 Thread Matt Arsenault
I didn't think to try this. Where is the address folding happening? On 02/07/2014 07:46 AM, Tom Stellard wrote: From: Tom Stellard --- lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 48 ++ lib/Target/R600/SIISelLowering.cpp | 29 lib/Targe

Re: [Mesa-dev] [PATCH] R600/SI: Split global vector loads with more than 4 elements

2014-02-10 Thread Matt Arsenault
Why would you want to do this for the small types? You should be able to load those in fewer loads and then promote them. On 02/10/2014 01:32 PM, Tom Stellard wrote: From: Tom Stellard --- lib/Target/R600/SIISelLowering.cpp | 8 +- test/CodeGen/R600/load.ll | 178 +++

Re: [Mesa-dev] [PATCH] R600/SI: Custom select 64-bit ADD

2014-02-13 Thread Matt Arsenault
On Feb 7, 2014, at 7:46 AM, Tom Stellard wrote: > From: Tom Stellard > > --- > lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 48 ++ > lib/Target/R600/SIISelLowering.cpp | 29 > lib/Target/R600/SIISelLowering.h | 1 - > test/CodeGen/R600/a

Re: [Mesa-dev] [PATCH] R600: Verify all instructions in the AsmPrinter on debug builds

2014-02-25 Thread Matt Arsenault
at 1:54 PM, Tom Stellard wrote: On Tue, Feb 25, 2014 at 01:47:17PM -0800, Matt Arsenault wrote: On 02/25/2014 01:42 PM, Tom Stellard wrote: +errs() << "Please file a bug a bugs.freedesktop.org\n"; Typo, s/a/at/ Thanks, I will fix t

Re: [Mesa-dev] [PATCH] R600: Verify all instructions in the AsmPrinter on debug builds

2014-02-25 Thread Matt Arsenault
On 02/25/2014 01:42 PM, Tom Stellard wrote: +errs() << "Please file a bug a bugs.freedesktop.org\n"; Typo, s/a/at/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeonsi: force NaNs to 0

2014-12-10 Thread Matt Arsenault
> On Dec 10, 2014, at 5:08 PM, Marek Olšák wrote: > > From: Marek Olšák > > This fixes incorrect rendering in Unreal Engine demos. > I don't know why it's called "dx10 clamp mode". MSDN doesn't mention it. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510 >

Re: [Mesa-dev] [PATCH shader-db] si-report: Track max waves per CU

2016-01-05 Thread Matt Arsenault
> On Jan 5, 2016, at 07:28, Marek Olšák wrote: > > Hi, > > I'd like us to do this computation in Mesa, because it can be more > accurate there. The pixel shader wave count depends heavily on LDS, > because each interpolated input occupies 12 dwords of LDS per > primitive and there can be 1-16 p
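To make the quoted figure concrete, here is a small C sketch of the LDS cost of pixel shader inputs, using only the 12 dwords per interpolated input per primitive stated in the message. The function name and the idea of passing the primitive count as a parameter are assumptions for illustration; the rest of the computation is cut off in the snippet and is not reconstructed here.

```c
/* Figure quoted in the message above. */
#define LDS_DWORDS_PER_INPUT_PER_PRIM 12u

/* Hypothetical helper: LDS dwords consumed by interpolated inputs for
 * a given number of primitives. */
unsigned ps_input_lds_dwords(unsigned num_inputs, unsigned num_prims)
{
    return num_inputs * LDS_DWORDS_PER_INPUT_PER_PRIM * num_prims;
}
```

Multiplying the result by 4 gives the footprint in bytes.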

Re: [Mesa-dev] [PATCH 2/2] radeonsi: Allow dumping LLVM IR before optimization passes

2016-02-04 Thread Matt Arsenault
> On Feb 4, 2016, at 00:15, Nicolai Hähnle wrote: > > From: Nicolai Hähnle > > Set R600_DEBUG=preoptir to dump the LLVM IR before optimization passes, > to allow diagnosing problems caused by optimization passes. > > Note that in order to compile the resulting IR with llc, you will first > ha

Re: [Mesa-dev] [PATCH 2/3] radeon/llvm: Set the target triple on the module

2016-02-04 Thread Matt Arsenault
> On Feb 4, 2016, at 13:02, Tom Stellard wrote: > > + LLVMSetTarget(ctx->gallivm.module, > + > +#if HAVE_LLVM < 0x0306 > + "r600--"); > +#else > + triple); > +#endif This alone does not set the datalayout, which should also be set here. -Matt

Re: [Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

2016-02-08 Thread Matt Arsenault
> On Feb 8, 2016, at 08:08, Tom Stellard wrote: > > Do SI/CI support fp64 denorms? If so, won't this hurt performance? > > We should tell the compiler we are enabling fp-64 denorms by adding > +fp64-denormals to the feature string. It would also be better to > read the float_mode value from t

Re: [Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

2016-02-08 Thread Matt Arsenault
> On Feb 8, 2016, at 08:08, Tom Stellard wrote: > > Do SI/CI support fp64 denorms? If so, won't this hurt performance? This is the only mode that should ever be used. I’m not sure why these are options. There technically are separate flush on input or flush on output options, but I’m not sure

Re: [Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

2016-02-08 Thread Matt Arsenault
> On Feb 8, 2016, at 12:38, Marek Olšák wrote: > >> >> We should tell the compiler we are enabling fp-64 denorms by adding >> +fp64-denormals to the feature string. It would also be better to >> read the float_mode value from the config registers emitted by the >> compiler. > > Yes, I agree,

Re: [Mesa-dev] [PATCH] radeonsi: enable denorms for 64-bit and 16-bit floats

2016-02-09 Thread Matt Arsenault
> On Feb 9, 2016, at 11:23, Tom Stellard wrote: > > We should still add +fp64-denormals even if the backend doesn't do > anything with it now. This is the default, so it doesn’t really matter anyway. -Matt

Re: [Mesa-dev] [PATCH 05/10] clover: Add environment variables for dumping kernel code

2014-10-08 Thread Matt Arsenault
On Oct 6, 2014, at 12:44 PM, Tom Stellard wrote: > --- > .../state_trackers/clover/llvm/invocation.cpp | 74 ++ > 1 file changed, 63 insertions(+), 11 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clov

Re: [Mesa-dev] [PATCH] radeonsi: use minnum and maxnum LLVM intrinsics for MIN and MAX opcodes

2014-11-22 Thread Matt Arsenault
> On Nov 22, 2014, at 7:35 AM, Marek Olšák wrote: > > AFAICS, the R600 backend doesn't implement the intrinsics for R600. > > Marek Should it? It’s trivial to switch to these for it, but I wasn’t sure what the actual semantics of its instructions were. There’s MAX and MAX_DX10, where I thin

Re: [Mesa-dev] [PATCH] radeonsi: add a debug flag for unsafe math LLVM optimizations

2016-06-13 Thread Matt Arsenault
> On Jun 13, 2016, at 09:27, Marek Olšák wrote: > > + { "unsafemath", DBG_UNSAFE_MATH, "Enable unsafe math shader > optimizations" }, Perhaps one for each of the individual fast math options as well (no nans, no signed zeros etc.)?

Re: [Mesa-dev] [PATCH 0/2] clover: add clCompileProgram

2014-08-04 Thread Matt Arsenault
On Aug 4, 2014, at 8:03 AM, EdB wrote: > Hello > > I'm done with the clCompile part of OpenCL 1.2. > > As you can see I use char* data to transfer data from core to llvm. > > At first I was thinking of using std class but we need to be binary safe > when data are transferred between c++98/c++

Re: [Mesa-dev] [PATCH 5/5] clover: Enable cl_khr_fp64 for devices that support doubles v2

2014-08-13 Thread Matt Arsenault
On Jun 26, 2014, at 7:15 AM, Francisco Jerez wrote: > Tom Stellard writes: > >> v2: >> - Report correct values for CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE and >>CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE. >> - Only define cl_khr_fp64 if the extension is supported. >> - Remove trailing space fr

Re: [Mesa-dev] [PATCH] radeonsi/gfx9: compile shaders with +xnack

2017-05-18 Thread Matt Arsenault
> On May 18, 2017, at 22:46, Marek Olšák wrote: > > From: Marek Olšák > > so that LLVM doesn't allocate SGPRs where XNACK is. > > Cc: 17.1 You shouldn’t be explicitly enabling xnack. This sounds like a workaround for a backend bug, and this has other consequences than changing the reserved

Re: [Mesa-dev] [PATCH 0/5] Volatile and invariant LDS memory ops

2017-11-09 Thread Matt Arsenault
> On Nov 10, 2017, at 07:41, Marek Olšák wrote: > > Hi, > > This fixes the TCS gl_ClipDistance piglit failure that was uncovered > by a recent LLVM change. The solution is to set volatile on loads > and stores to enforce proper ordering. > > Please review. > Every LDS access certainly shoul

Re: [Mesa-dev] [PATCH 3/4] ac/llvm: set xnack like radeonsi does.

2017-07-06 Thread Matt Arsenault
> On Jul 5, 2017, at 19:09, Dave Airlie wrote: > > From: Dave Airlie > > Use family, but only set xnack+ for gfx9. > The driver shouldn’t be explicitly setting this. This should be set as part of the subtarget chosen -Matt

Re: [Mesa-dev] [PATCH 3/4] ac/llvm: set xnack like radeonsi does.

2017-07-06 Thread Matt Arsenault
> On Jul 6, 2017, at 13:08, Dave Airlie wrote: > > On 7 July 2017 at 05:07, Matt Arsenault wrote: >> >>> On Jul 5, 2017, at 19:09, Dave Airlie wrote: >>> >>> From: Dave Airlie >>> >>> Use family, but only set xnack+ for gfx9.

Re: [Mesa-dev] [PATCH 3/6] ac/nir: rewrite local variable handling

2017-07-06 Thread Matt Arsenault
> On Jul 6, 2017, at 18:31, Connor Abbott wrote: > > After looking into it some more, I think LLVM won't promote allocas to > registers at all when there are non-constant indices in the mix, and > fixing it seems kinda involved. I guess a better solution for now AMDGPUPromoteAlloca does this, b

Re: [Mesa-dev] [PATCH 3/6] ac/nir: rewrite local variable handling

2017-07-07 Thread Matt Arsenault
> On Jul 6, 2017, at 19:02, Connor Abbott wrote: > > On Thu, Jul 6, 2017 at 6:36 PM, Matt Arsenault wrote: >> >> On Jul 6, 2017, at 18:31, Connor Abbott wrote: >> >> After looking into it some more, I think LLVM won't promote allocas to >>

Re: [Mesa-dev] [PATCH] radv: enable denorms for 64-bit and 16-bit floats

2017-12-28 Thread Matt Arsenault
> On Dec 28, 2017, at 16:55, Samuel Pitoiset wrote: > > Similar to RadeonSI. > > This fixes: > dEQP-VK.image.texel_view_compatible.graphic.basic.attachment_read.bc*r16g16b16a16_sfloat > dEQP-VK.image.extended_usage_bit.attachment_write.r16_sfloat > > Signed-off-by: Samuel Pitoiset > --- > sr

Re: [Mesa-dev] [PATCH] radv: lower ffma in nir.

2017-10-03 Thread Matt Arsenault
> On Oct 3, 2017, at 13:58, Dave Airlie wrote: > > From: Dave Airlie > > So it appears the Vulkan SPIR-V fma opcode can be equivalent to a > mad operation, and the fma hw opcode on AMD hw is issued like a double > opcode so is slower. Also the radeonsi stack does this. > > This appears to imp

Re: [Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

2017-10-04 Thread Matt Arsenault
> On Oct 4, 2017, at 12:50, Marek Olšák wrote: > > The LLVM backend selects MAD (unfused) for fmuladd, and FMA (fused) for fma. For f64 and f16 by default it will emit an FMA since mad doesn’t support denorms.
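The numerical difference between the unfused and fused forms mentioned above can be shown with a small C sketch. Both function names are hypothetical. The fused variant is emulated through `double` so that no libm call is needed: the f32 product is exact in `double`, but the final sum is rounded to `double` and then to `float`, so in rare cases it can differ from a true `fmaf`.

```c
/* Unfused multiply-add: the product is rounded to f32 before the add.
 * The volatile temporary keeps the compiler from contracting the
 * expression into a fused operation. */
float madd_unfused_f32(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

/* Emulated fused multiply-add (assumption: sketch only, see lead-in).
 * The product of two f32 values is exact in double. */
float madd_fused_f32(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}
```

With `a = b = 0x1.001p+0f` and `c = -0x1.002p+0f`, the unfused form loses the low bit of the product and yields 0, while the fused form keeps it and yields `0x1p-24f`.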

Re: [Mesa-dev] [PATCH] radeonsi: disable sinking common instructions down to the end block

2017-03-14 Thread Matt Arsenault
> On Mar 14, 2017, at 17:21, Samuel Pitoiset wrote: > > Initially this was a workaround for a bug introduced in LLVM 4.0 > in the SimplifyCFG pass that caused image intrinsics to disappear > (because they were badly sunk). Finally, this is a win because it > decreases SGPR spilling and increase

Re: [Mesa-dev] [PATCH] radv: flush f32->f16 conversion denormals to zero.

2017-03-16 Thread Matt Arsenault
> On Mar 16, 2017, at 20:02, Dave Airlie wrote: > > From: Dave Airlie > > SPIR-V defines the f32->f16 operation as flushing denormals to 0, > this compares the class using amd class opcode. > > Thanks to Matt Arsenault for figuring it o
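The flushing rule described in the commit message can be sketched on the f32 side as below. This is an illustration under stated assumptions, not the radv implementation: it assumes that any input whose magnitude is below the smallest normal f16 value (2^-14) is replaced by a signed zero before conversion, and `flush_below_f16_normal` is a hypothetical name.

```c
/* Smallest positive normal half-precision value, 2^-14. */
#define F16_MIN_NORMAL 0x1p-14f

/* Hypothetical sketch: replace values that would become f16 denormals
 * with a zero of the same sign. NaN fails both comparisons and is
 * returned unchanged. */
float flush_below_f16_normal(float x)
{
    if (x > -F16_MIN_NORMAL && x < F16_MIN_NORMAL)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}
```

Values at or above 2^-14 in magnitude pass through untouched and are left to the ordinary f32 to f16 conversion.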

Re: [Mesa-dev] [PATCH] ac, radv: fix removing the vec3 restriction on SI

2019-06-03 Thread Matt Arsenault
> On Jun 3, 2019, at 9:13 AM, Samuel Pitoiset wrote: > > I thought LLVM was able to handle that itself but actually it > does not. That means we shouldn't try to emit vec3 on SI because > it's unsupported. > It should. Can you file a bug with an example that doesn’t work? > Fixes: 6970a9a6