Re: [Mesa-dev] [PATCH 08/13] i965/vec4: Add support for untyped surface message sends from GRF.

2015-03-06 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Fri, Feb 27, 2015 at 05:34:51PM +0200, Francisco Jerez wrote: >> This doesn't actually enable untyped surface message sends from GRF >> yet, the upcoming atomic counter and image intrinsic lowering code >> will. >> ---

Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.

2015-03-06 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote: >> --- >> src/mesa/drivers/dri/i965/brw_eu.h | 19 ++-- >> src/mesa/drivers/dri/i965/brw_eu_emit.c | 58 >> +

Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG

2015-03-06 Thread Francisco Jerez
Tom Stellard writes: > On Thu, Mar 05, 2015 at 08:42:25PM +0200, Francisco Jerez wrote: >> Tom Stellard writes: >> >> > This means dropping CL_FP_DENORM from the current return value. >> > --- >> > src/gallium/state_trackers/clover/api/device.cpp |

Re: [Mesa-dev] [PATCH 1/2] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG v2

2015-03-06 Thread Francisco Jerez
) > OpenCL 1.2 2.0 embedded profile minimum is RTZ or RTN if not TYPE_CUSTOM > (pages 352, and 262) > > sorry I did not catch the email yesterday. > Ah, you're right, I ended up looking at the embedded profile by accident. With the CL_FP_INF_NAN sentence left out this patch is:

Re: [Mesa-dev] [RFC] i965: Factor out descriptor building for indirect send messages

2015-03-07 Thread Francisco Jerez
Topi Pohjolainen writes: > The original patch from Curro was based on something that is not > present in the master yet. This patch tries to mimic the logic on > top of master. > I wanted to see if I could separate the building of the descriptor > instruction from building of the send instruction. Th

Re: [Mesa-dev] [PATCH v2] nv30: Add unused attribute to function nv40_fp_bra.

2015-03-07 Thread Francisco Jerez
Matt Turner writes: > On Fri, Mar 6, 2015 at 11:43 PM, Vinson Lee wrote: >> Silences GCC unused-function warning. >> >> nv30/nvfx_fragprog.c:333:1: warning: ‘nv40_fp_bra’ defined but not used >> [-Wunused-function] >> nv40_fp_bra(struct nvfx_fpc *fpc, unsigned target) >> ^ >> >> Signed-off-by

Re: [Mesa-dev] [PATCH v2] nv30: Add unused attribute to function nv40_fp_bra.

2015-03-07 Thread Francisco Jerez
Matt Turner writes: > On Sat, Mar 7, 2015 at 12:54 PM, Francisco Jerez > wrote: >> Matt Turner writes: >> >>> On Fri, Mar 6, 2015 at 11:43 PM, Vinson Lee wrote: >>>> Silences GCC unused-function warning. >>>> >>>> nv30/nvf

Re: [Mesa-dev] [RFC] i965: Factor out descriptor building for indirect send messages

2015-03-09 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote: >> Topi Pohjolainen writes: >> >> > The original patch from Curro was based on something that is not >> > present in the master yet. This patch tries to

Re: [Mesa-dev] [PATCH 4/4] Clover: use get_device_vendor instead of get_vendor

2015-03-09 Thread Francisco Jerez
Giuseppe Bilotta writes: > The pipe's get_vendor method returns something more akin to a driver > vendor string in most cases, instead of the actual device vendor. Use > get_device_vendor instead, which was introduced specifically for this > purpose. For this patch: Reviewed-by

Re: [Mesa-dev] [RFC] i965: Factor out descriptor building for indirect send messages

2015-03-10 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Mon, Mar 09, 2015 at 12:43:08PM +0200, Francisco Jerez wrote: >> "Pohjolainen, Topi" writes: >> >> > On Sat, Mar 07, 2015 at 04:15:08PM +0200, Francisco Jerez wrote: >> >> Topi Pohjolainen writes: &

Re: [Mesa-dev] [PATCH v2 09/20] i965/fs: indirect addressing with doubles is not supported in IVB/BYT

2017-02-09 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > It is tested empirically that IVB/BYT don't support indirect addressing > with doubles but it is not documented in the PRM. > > This patch applies the same solution as for Cherryview/Broxton and > takes into account that we cannot double the stride, since the

Re: [Mesa-dev] [PATCH v2 15/20] i965/vec4: consider subregister offset in live variables

2017-02-09 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > From: "Juan A. Suarez Romero" > > Take into account the offset value when getting the var from register. > > This is required when dealing with an operation that writes half of the > register (like one d2x in IVB/BYT, which uses exec_size == 4). > > Note that fo

[Mesa-dev] [PATCHv3 09/20] i965/fs: Get 64-bit indirect moves working on IVB.

2017-02-09 Thread Francisco Jerez
--- This replaces "[PATCH v2 09/20] i965/fs: indirect addressing with doubles is not supported in IVB/BYT". src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27 -- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp

Re: [Mesa-dev] [PATCHv3 09/20] i965/fs: Get 64-bit indirect moves working on IVB.

2017-02-09 Thread Francisco Jerez
Francisco Jerez writes: > --- > This replaces "[PATCH v2 09/20] i965/fs: indirect addressing with > doubles is not supported in IVB/BYT". > Note that some of the fp64 indirect addressing test-cases still fail on IVB even with this patch applied, but the reason doesn'

Re: [Mesa-dev] [PATCH v2 09/20] i965/fs: indirect addressing with doubles is not supported in IVB/BYT

2017-02-10 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On Thu, 2017-02-09 at 12:18 -0800, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >> > It is tested empirically that IVB/BYT don't support indirect >> > addressing >> > with doubles but it is n

Re: [Mesa-dev] [PATCHv3 09/20] i965/fs: Get 64-bit indirect moves working on IVB.

2017-02-10 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On Thu, 2017-02-09 at 18:28 -0800, Francisco Jerez wrote: >> Francisco Jerez writes: >> >> > --- >> > This replaces "[PATCH v2 09/20] i965/fs: indirect addressing with >> > doubles is not supported in IV

Re: [Mesa-dev] [PATCHv3 09/20] i965/fs: Get 64-bit indirect moves working on IVB.

2017-02-10 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On Thu, 2017-02-09 at 10:16 -0800, Francisco Jerez wrote: >> --- >> This replaces "[PATCH v2 09/20] i965/fs: indirect addressing with >> doubles is not supported in IVB/BYT". >> >> src/mesa

Re: [Mesa-dev] [PATCH 1/2] i965/ps: Use ForceThreadDispatchEnable instead of AccessUAV.

2017-02-13 Thread Francisco Jerez
Jason Ekstrand writes: > The AccessUAV bit is not quite what we want because it's more about > coherency between storage operations than it is about dispatch. Also, > the 3DSTATE_PS_EXTRA::PixelShaderHasUAV bit seems to cause hangs on > Broadwell for unknown reasons so it's best to just leave it

Re: [Mesa-dev] [Mesa-stable] [PATCH 1/2] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-14 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > Previously we were emitting two MOV_INDIRECT instructions by calculating > source's indirect offsets for each 32-bit half of a DF source. However, > this is not needed as we can just emit two 32-bit MOV INDIRECT without > doing that calculation. > Maybe mentio

Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] i965/fs: emit MOV_INDIRECT with the source with the right register type

2017-02-14 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > This was hiding bugs as it retyped the source to destination's type. > > Signed-off-by: Samuel Iglesias Gonsálvez > Cc: "17.0" Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +- >

Re: [Mesa-dev] [PATCH v3 04/24] i965/fs: double regioning parameters and execsize for DF in IVB/BYT

2017-02-15 Thread Francisco Jerez
floats." > + * > + * Summarized: when handling DF-typed arguments, ExecSize, > + * VertStride, and Width must be doubled, and HorzStride must be > + * doubled when the region is not scalar. > + * The comment above seems misleading s

Re: [Mesa-dev] [PATCH v3 06/24] i965: Use <0, 2, 1> region for scalar DF sources on IVB/BYT.

2017-02-15 Thread Francisco Jerez
s_inst *inst, >unreachable("not reached"); > } > Maybe put a short comment here along the same lines as the commit message so you don't need to run git-blame to figure out what this is about? Either way patch is: Reviewed-by: Francisco Jerez > + if

Re: [Mesa-dev] [PATCH v3 08/24] i965/fs: fix dst stride in IVB/BYT type conversions

2017-02-15 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > From: "Juan A. Suarez Romero" > > When converting a DF to 32-bit conversions, we set dst stride to 2, > to fulfill alignment restrictions because the upper Dword of every > Qword will be written with undefined value. > > But in IVB/BYT, this is not necessary,

Re: [Mesa-dev] [PATCH v3 09/24] i965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT

2017-02-15 Thread Francisco Jerez
condition (Curro) > > v3: > - Add spec quote (Curro) > > Signed-off-by: Samuel Iglesias Gonsálvez > Signed-off-by: Juan A. Suarez Romero Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 17 ++--- > 1 file changed, 14 insertions

Re: [Mesa-dev] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-18 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > The lowered BSW/BXT indirect move instructions had incorrect > source types, which luckily wasn't causing incorrect assembly to be > generated due to the bug fixed in the next patch, but would have > confused the remaining back-end IR infrastructure due to the

Re: [Mesa-dev] [PATCH v2 2/3] i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles

2017-02-18 Thread Francisco Jerez
DER_OPCODE_MOV_INDIRECT, icp_handle, > - fs_reg(brw_vec8_grf(first_icp_handle, 0)), > + retype(fs_reg(brw_vec8_grf(first_icp_handle, 0)), > BRW_REGISTER_TYPE_UD), You could specify the type as icp_handle.type for consistency here. With that fi

Re: [Mesa-dev] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-20 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On Mon, 2017-02-20 at 08:58 +0100, Samuel Iglesias Gonsálvez wrote: >> On Sat, 2017-02-18 at 18:58 -0800, Francisco Jerez wrote: >> > Samuel Iglesias Gonsálvez writes: >> > >> > > The lowered BSW/BXT indirect move i

Re: [Mesa-dev] [Mesa-stable] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-21 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On 20/02/17 21:31, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >>> On Mon, 2017-02-20 at 08:58 +0100, Samuel Iglesias Gonsálvez wrote: >>>> On Sat, 2017-02-18 at 18:58 -0800, Francisco Jerez wrote: >

Re: [Mesa-dev] [Mesa-stable] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-22 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > On 21/02/17 21:07, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >>> On 20/02/17 21:31, Francisco Jerez wrote: >>>> Samuel Iglesias Gonsálvez writes: >>>> >>>>> On Mon, 2017-0

Re: [Mesa-dev] [Mesa-stable] [PATCH v3 1/3] i965/fs: mark last DF uniform array element as 64 bit live one

2017-02-25 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > This bug can make that we don't detect the end of a contiguous area > correctly and push larger areas than the real ones. > > Signed-off-by: Samuel Iglesias Gonsálvez > Cc: "17.0" Reviewed-by: Francisco Jerez > --- >

Re: [Mesa-dev] [PATCH v3 3/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-25 Thread Francisco Jerez
n BSW/BXT. > > v3: > - Move changes in assign_constant_locations() to other patch. > > Signed-off-by: Samuel Iglesias Gonsálvez > Cc: "17.0" Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 41 > >

Re: [Mesa-dev] [PATCH v2 1/1] clover: Dump linked binary to a different file

2017-02-25 Thread Francisco Jerez
's used in some places to disambiguate the top level llvm namespace, but it shouldn't be necessary for the std namespace). > + ::std::string id = "." + mod->getModuleIdentifier() + "-" + > + ::std::to_string(seq++); > + Mark as const. Wi

Re: [Mesa-dev] [Mesa-stable] [PATCH v3 2/3] i965/fs: detect different bit size accesses to uniforms to push them in proper locations

2017-02-27 Thread Francisco Jerez
nsigned *num_push_constants, > unsigned *num_pull_constants, I still feel like this function is growing over the top trying to achieve multiple unrelated things in one place, but if you just want to shuffle things around as little as possible for mesa-stable, fine: Review

Re: [Mesa-dev] [PATCH 07/20] i965/fs: Import image memory offset calculation code.

2015-07-24 Thread Francisco Jerez
eading > through this. You managed to handle an amazing number of cases > without so much as a single if statement or predicated instruction. > My hat's off to you sir. > Hah, I feel flattered. :) > Modulo Y-tiling and comments, > > Reviewed-by: Jason Ekstrand > >

Re: [Mesa-dev] [PATCH 01/12] i965/fs: Define logical texture sampling opcodes.

2015-07-24 Thread Francisco Jerez
Kenneth Graunke writes: > On Saturday, July 18, 2015 05:34:47 PM Francisco Jerez wrote: >> Each logical variant is largely equivalent to the original opcode but >> instead of taking a single payload source it expects the arguments >> separately as individual sources, like

Re: [Mesa-dev] [PATCH 01/12] i965/fs: Define logical texture sampling opcodes.

2015-07-24 Thread Francisco Jerez
Francisco Jerez writes: > Kenneth Graunke writes: > >> On Saturday, July 18, 2015 05:34:47 PM Francisco Jerez wrote: >>> Each logical variant is largely equivalent to the original opcode but >>> instead of taking a single payload source it expects the argument

Re: [Mesa-dev] [PATCH 4/5] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-24 Thread Francisco Jerez
Iago Toral Quiroga writes: > When we have code such as this: > > mov vgrf1.0.x:F, vgrf2.:F > mov vgrf3.0.x:F, vgrf1.:F > ... > mov vgrf3.0.x:F, vgrf1.:F > > And vgrf1 is chosen for spilling, we can emit this: > > mov vgrf1.0.x:F, vgrf2.:F > gen4_scratch_write hw_reg0:F, vgrf1.

Re: [Mesa-dev] [PATCH 3/5] i965/vec4: Register spilling should never see registers with size != 1

2015-07-24 Thread Francisco Jerez
Iago Toral Quiroga writes: > Larger registers should have been moved to scratch (like GRF array access) > or split to size 1 by the split_virtual_grfs pass. Not necessarily. split_virtual_grfs() won't be able to split stuff which is read or written at once by the same instruction -- E.g. by sen

[Mesa-dev] [PATCH 04.5/12] i965/fs: Fix misleading comment regarding the message header in emit_texture_gen7.

2015-07-24 Thread Francisco Jerez
This hasn't been overallocating space for the header for a long time. It still leaves the header uninitialized though until the generator fixes it. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/br

Re: [Mesa-dev] [PATCH 07/20] i965/fs: Import image memory offset calculation code.

2015-07-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Jul 24, 2015 4:00 AM, "Francisco Jerez" wrote: >> >> Jason Ekstrand writes: >> >> > Ok, I've looked through this again and convinced myself that it's >> > *mostly* correct. I am a bit skeptical of the add

Re: [Mesa-dev] [PATCH 07/20] i965/fs: Import image memory offset calculation code.

2015-07-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jul 24, 2015 at 7:39 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Jul 24, 2015 4:00 AM, "Francisco Jerez" wrote: >>>> >>>> Jason Ekstrand writes: >>>> >>>

Re: [Mesa-dev] [PATCH 07/20] i965/fs: Import image memory offset calculation code.

2015-07-24 Thread Francisco Jerez
Jason Ekstrand writes: > On Jul 24, 2015 8:02 AM, "Francisco Jerez" wrote: >> >> Jason Ekstrand writes: >> >> > On Fri, Jul 24, 2015 at 7:39 AM, Francisco Jerez > wrote: >> >> Jason Ekstrand writes: >> >> >> >

[Mesa-dev] [PATCHv4 07/20] i965/fs: Import image memory offset calculation code.

2015-07-24 Thread Francisco Jerez
Define a function to calculate the memory address of the image location given by a vector of coordinates. This is required in cases where we need to fall back to untyped surface access, which take a raw memory offset and know nothing about surface coordinates, type conversion or memory tiling and

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-25 Thread Francisco Jerez
Zoltan Gilian writes: > Read-only and write-only image arguments are recognized and > distinguished. > Attributes of the image arguments are passed to the kernel as implicit > arguments. > --- > src/gallium/state_trackers/clover/core/kernel.cpp | 46 ++- > src/gallium/state_trackers/clover

Re: [Mesa-dev] [PATCH 3/3] clover: add clLinkProgramm (CL 1.2)

2015-07-25 Thread Francisco Jerez
EdB writes: > --- > src/gallium/state_trackers/clover/api/program.cpp | 35 > ++ > src/gallium/state_trackers/clover/core/error.hpp | 7 + > src/gallium/state_trackers/clover/core/program.cpp | 4 +++ > src/gallium/state_trackers/clover/core/program.hpp | 1 + > ..

Re: [Mesa-dev] [PATCH 3/3 v4.1] clover: add clLinkProgramm (CL 1.2)

2015-07-26 Thread Francisco Jerez
he last two constructor arguments so you don't need to pass them at all. > + try { > + prog().link(devs, opts, progs); > + ret_error(r_errcode, CL_SUCCESS);; Double semicolon. With these fixed this patch is: Reviewed-by: Francisco Jerez > + } catch (link_error &

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-26 Thread Francisco Jerez
score are reserved for the implementation at least on C and >> GLSL, not sure about OpenCL-C) > > I believe this is true for OpenCL C too, since it is an extension to > C99. Identifiers starting with double underscores are reserved in C99. > IIRC identifiers starting with double

Re: [Mesa-dev] [PATCH] clover: Pass image attributes to the kernel

2015-07-27 Thread Francisco Jerez
't >>> touch the explicit_arg iterator at all AFAICT, so it will be left >>> pointing one past the last general semantic argument >> >> Ok, my mistake, I didn't think this through. >> >>> Hmmm... So you only need it as padding? Wouldn't it be

Re: [Mesa-dev] [PATCH 1/2] clover: move find_kernels to functions

2015-07-27 Thread Francisco Jerez
kernels, address_spaces); > + m = build_module_llvm(mod, address_spaces); > break; >case PIPE_SHADER_IR_NATIVE: { > std::vector code = compile_native(mod, triple, processor, > get_debug_flags() & DBG_ASM, &g

Re: [Mesa-dev] [PATCH 2/2] clover: pass image attributes to the kernel

2015-07-27 Thread Francisco Jerez
/ Image format implicit argument > + if (type_name == "__llvm_image_format") { > +args.push_back(module::argument(module::argument::scalar, > + sizeof(cl_uint), > +TD.getTypeStoreS

[Mesa-dev] [PATCH] i965/fs: Clamp image array indices to the array bounds on IVB.

2015-07-27 Thread Francisco Jerez
This fixes the spec@arb_shader_image_load_store@invalid index bounds piglit tests on IVB, which were causing a GPU hang and then a crash due to the invalid binding table index result of the array index calculation. Other generations seem to behave sensibly when an invalid surface is provided so it
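The clamping idea in the message above can be illustrated on the CPU side. This is only a sketch, not the driver code: `clamp_array_index` is a hypothetical helper, and it assumes the index is treated as unsigned so that a single minimum against `array_size - 1` is enough to keep the binding table index in range.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not from the patch): clamp an image array index
// to the last valid element. Treating the index as unsigned means a
// negative value wraps to a large number and is clamped as well.
inline unsigned clamp_array_index(unsigned index, unsigned array_size)
{
   assert(array_size > 0); // An empty array has no valid index to clamp to.
   return std::min(index, array_size - 1);
}
```

With a four-element array, an in-bounds index such as 2 is returned unchanged, while 7 or a wrapped negative index both map to 3.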

[Mesa-dev] [PATCHv2] i965/fs: Factor out source components calculation to a separate method.

2015-07-27 Thread Francisco Jerez
This cleans up fs_inst::regs_read() slightly by disentangling the calculation of "components" from the handling of message payload arguments. This will also simplify the SIMD lowering and logical send message lowering passes, because it will avoid expressions like 'regs_read * REG_SIZE / component

Re: [Mesa-dev] [PATCH 14/32] i965/fs: Fix register coalesce not to lose track of the second half of 16-wide moves.

2015-07-27 Thread Francisco Jerez
Francisco Jerez writes: > Matt Turner writes: > >> On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez >> wrote: >>> Fixes rewrite by the register coalesce pass of references to >>> individual halves of 16-wide coalesced registers.

Re: [Mesa-dev] [PATCH 14/32] i965/fs: Fix register coalesce not to lose track of the second half of 16-wide moves.

2015-07-27 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> Fixes rewrite by the register coalesce pass of references to >> individual halves of 16-wide coalesced registers. >> --- >> src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 8

Re: [Mesa-dev] [PATCH 14/32] i965/fs: Fix register coalesce not to lose track of the second half of 16-wide moves.

2015-07-27 Thread Francisco Jerez
Francisco Jerez writes: > Jason Ekstrand writes: > >> On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez >> wrote: >>> Fixes rewrite by the register coalesce pass of references to >>> individual halves of 16-wide coalesced registers.

[Mesa-dev] [PATCH 11/14] i965/fs: Switch lower_logical_sends() to the fs_builder constructor from instruction.

2015-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 2775d98..57e4dd7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_

[Mesa-dev] [PATCH 03/14] i965/fs: Set execution controls correctly for lowered pull constant loads.

2015-07-28 Thread Francisco Jerez
demote_pull_constants() was ignoring the execution size and channel selects of the instruction that wanted the constant, which doesn't matter for uniform pull constant loads because all channels get the same scalar value, but it might for varying pull constant loads. Fix it by using the new fs_bui

[Mesa-dev] [PATCH 13/14] i965/fs: Don't set exec_all on instructions wider than the original in lower_simd_width.

2015-07-28 Thread Francisco Jerez
This could have led to somewhat increased bandwidth usage for lowered texturing instructions on Gen4 (which is the only case in which lower_width may be greater than inst->exec_size). After the previous patches the invariant mentioned in the comment should no longer be assumed by any of the other

[Mesa-dev] [PATCH 10/14] i965/fs: Switch lower_load_payload() to the fs_builder constructor from instruction.

2015-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 8bc9372..2775d98 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/

[Mesa-dev] [PATCH 09/14] i965/fs: Don't rely on the default builder to create a null register in emit_spill.

2015-07-28 Thread Francisco Jerez
It's not guaranteed to have the same width as the instruction generating the spilled variable. --- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/br

[Mesa-dev] [PATCH 01/14] i965/fs: Define a new fs_builder constructor taking an instruction as argument.

2015-07-28 Thread Francisco Jerez
We have a number of optimization passes that repeat the same pattern before inserting new instructions into the program based on some previous instruction: They point the default builder at the original instruction, then call exec_all() and group() to select the same execution controls the original

[Mesa-dev] [PATCH 14/14] i965/fs: Make the default builder 64-wide before entering the optimization loop.

2015-07-28 Thread Francisco Jerez
Not a typo. Replace the default builder with one of bogus width to catch cases in which optimization passes assume that the default dispatch width is good enough. The execution controls of instructions emitted during optimization should in general match the original code that is being manipulated

[Mesa-dev] [PATCH 07/14] i965/fs: Set up the builder execution size explicitly in opt_sampler_eot().

2015-07-28 Thread Francisco Jerez
opt_sampler_eot() was relying on the default builder to have the same width as the sampler and FB write opcodes it was eliminating, the channel selects didn't matter because the builder was only being used to allocate registers, no new instructions were being emitted with it. A future commit will c

[Mesa-dev] [PATCH 06/14] i965/fs: Initialize a builder explicitly in opt_peephole_predicated_break().

2015-07-28 Thread Francisco Jerez
This wasn't taking into account the execution controls of the original instruction, but it was most likely not a bug because control flow instructions are typically full width. --- src/mesa/drivers/dri/i965/brw_fs_peephole_predicated_break.cpp | 8 +--- 1 file changed, 5 insertions(+), 3 delet

[Mesa-dev] [PATCH 12/14] i965/fs: Switch opt_cse() to the fs_builder constructor from instruction.

2015-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index e33fe6a..a123ff2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/

[Mesa-dev] [PATCH 05/14] i965/fs: Set execution controls explicitly in opt_peephole_sel().

2015-07-28 Thread Francisco Jerez
Emit the SELs and MOVs with the same execution controls as the original MOVs, and the CMP with the same execution controls as the IF. Also explicitly check that the execution controls of any pair of MOVs being folded into a SEL are compatible (which is almost always going to be the case), since oth

[Mesa-dev] [PATCH 08/14] i965/fs: Initialize a builder explicitly in the gen4 send dependency work-arounds.

2015-07-28 Thread Francisco Jerez
Instead of relying on the default one. This shouldn't lead to any functional changes because DEP_RESOLVE_MOV overrides the execution controls of the instruction anyway. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/src/mes

[Mesa-dev] [PATCH 02/14] i965/fs: Set the execution size of the MOVs correctly in opt_combine_constants().

2015-07-28 Thread Francisco Jerez
The execution size was being left equal to the default of 8/16, which AFAICT would have overwritten components other than the one we wanted to initialize and could potentially have corrupted other registers. --- src/mesa/drivers/dri/i965/brw_fs_combine_constants.cpp | 2 +- 1 file changed, 1 inser

[Mesa-dev] [PATCH 04/14] i965/fs: Set execution controls correctly in lower_integer_multiplication().

2015-07-28 Thread Francisco Jerez
lower_integer_multiplication() was ignoring the execution controls of the original MUL instruction. Fix it by using the new fs_builder constructor. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/

[Mesa-dev] [PATCH 2/3] i965/fs: Fix rewrite of the second half of 16-wide coalesced registers.

2015-07-28 Thread Francisco Jerez
The register coalesce pass wasn't rewriting the destination and sources of instructions that accessed the second half of a coalesced register previously copied with a 16-wide MOV instruction. E.g.: | ADD (16) vgrf0:f, vgrf0:f, 1.0:f | MOV (16) vgrf1:f, vgrf0:f | MOV (8) vgrf2:f, vgrf0+1:f { sech

[Mesa-dev] [PATCH 1/3] i965/fs: Detect multi-register MOVs correctly in register_coalesce.

2015-07-28 Thread Francisco Jerez
register_coalesce() was considering the exec_size of the MOV instruction alone to decide whether the register at offset+1 of the source VGRF was being copied to inst->dst.reg_offset+1 of the destination VGRF, which is only a valid assumption if the move has a 32-bit execution type. Use regs_read()
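To see why exec_size alone is not enough, here is a small arithmetic sketch. It is not the actual regs_read() implementation: `regs_covered` and `is_multi_register_mov` are hypothetical helpers, and the sketch assumes a 32-byte GRF register and a packed (stride 1) region.

```cpp
#include <cassert>

// Assumption: one GRF register holds 32 bytes.
constexpr unsigned REG_SIZE = 32;

// Hypothetical helper: number of registers touched by a packed operand
// of `exec_size` channels of `type_size` bytes each, rounded up.
constexpr unsigned regs_covered(unsigned exec_size, unsigned type_size)
{
   return (exec_size * type_size + REG_SIZE - 1) / REG_SIZE;
}

// Hypothetical helper: does a MOV with these parameters copy more than
// one register?
constexpr bool is_multi_register_mov(unsigned exec_size, unsigned type_size)
{
   return regs_covered(exec_size, type_size) > 1;
}

// A 16-wide MOV of 32-bit values spans two registers...
static_assert(regs_covered(16, 4) == 2, "SIMD16 x 32-bit covers 2 registers");
// ...but so does an 8-wide MOV of 64-bit values,
static_assert(regs_covered(8, 8) == 2, "SIMD8 x 64-bit covers 2 registers");
// while a 16-wide MOV of 16-bit values fits in a single register.
static_assert(regs_covered(16, 2) == 1, "SIMD16 x 16-bit covers 1 register");
```

Checking "exec_size == 16" would misclassify the last two cases, which is why a size-based query is the safer test.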

[Mesa-dev] [PATCH 3/3] i965/fs: Simplify instruction rewrite loop in the register coalesce pass.

2015-07-28 Thread Francisco Jerez
For some reason the loop that rewrites all occurrences of the coalesced register was iterating over all possible offsets until it would find one that compares equal to the offset of a source or destination of any instruction in the program. Since the mapping between old and new offsets is already

Re: [Mesa-dev] [PATCH 1/3] i965/fs: Detect multi-register MOVs correctly in register_coalesce.

2015-07-28 Thread Francisco Jerez
Jason Ekstrand writes: > On Jul 28, 2015 2:43 AM, "Francisco Jerez" wrote: >> >> register_coalesce() was considering the exec_size of the MOV >> instruction alone to decide whether the register at offset+1 of the >> source VGRF was being copied to inst->

Re: [Mesa-dev] [PATCH v2 0/6] Improvements to the vec4 spilling code

2015-07-28 Thread Francisco Jerez
Iago Toral Quiroga writes: > Link to v1: > http://lists.freedesktop.org/archives/mesa-dev/2015-July/089766.html > > Changes after review (Curro) > - Drop the patch that asserted that the reg size should always be 1 > - Expand this so that we do not unspill a register if we have just > uns

Re: [Mesa-dev] [PATCH 08/14] i965/fs: Initialize a builder explicitly in the gen4 send dependency work-arounds.

2015-07-29 Thread Francisco Jerez
Jason Ekstrand writes: > On Tue, Jul 28, 2015 at 1:23 AM, Francisco Jerez > wrote: >> Instead of relying on the default one. This shouldn't lead to any >> functional changes because DEP_RESOLVE_MOV overrides the execution >> controls of the instruction anyway.

Re: [Mesa-dev] [PATCH v2 0/6] Improvements to the vec4 spilling code

2015-07-29 Thread Francisco Jerez
Iago Toral writes: > On Tue, 2015-07-28 at 18:17 +0300, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > Link to v1: >> > http://lists.freedesktop.org/archives/mesa-dev/2015-July/089766.html >> > >> > Changes after review (Curro)

Re: [Mesa-dev] [PATCH 08/14] i965/fs: Initialize a builder explicitly in the gen4 send dependency work-arounds.

2015-07-29 Thread Francisco Jerez
Jason Ekstrand writes: > On Jul 29, 2015 3:12 AM, "Francisco Jerez" wrote: >> >> Jason Ekstrand writes: >> >> > On Tue, Jul 28, 2015 at 1:23 AM, Francisco Jerez > wrote: >> >> Instead of relying on the default one. This shouldn't

[Mesa-dev] [PATCH] i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795.

2015-07-30 Thread Francisco Jerez
With num_direct_uniforms == 0 there's no space allocated in the param_size array for the one block of direct uniforms -- On the FS stage this would be a harmless no-op because it would simply re-set one of the param_size entries allocated for the sampler units to zero, but on the VS stage it has be

Re: [Mesa-dev] [PATCH v2 6/6] i965: Add a debug option for spilling everything in vec4 code

2015-07-30 Thread Francisco Jerez
UG_SPILL }, > + { "spill_frag", DEBUG_SPILL_FS }, How about we call this "spill_fs" instead? The flag doesn't only affect fragment shaders, AFAICT it will cause all programs compiled with the FS back-end [F for fast ;)] to spill everything. With that fixed: Review

Re: [Mesa-dev] [PATCH v2 6/6] i965: Add a debug option for spilling everything in vec4 code

2015-07-30 Thread Francisco Jerez
Iago Toral writes: > On Thu, 2015-07-30 at 15:58 +0300, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > --- >> > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +- >> > src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- >

Re: [Mesa-dev] [PATCH v2 5/6] i965/vec4: Adjust spilling cost for consecutive instructions

2015-07-30 Thread Francisco Jerez
Iago Toral Quiroga writes: > Previous patches made it so that we do not need to unspill the same vgrf > with every instruction as long as these instructions come right after > the register was spilled or unspilled. This means that actually spilling > the register is now cheaper in these scenarios

Re: [Mesa-dev] [PATCH v2 2/6] i965/vec4: Remove checks for reladdr when checking for spillable registers

2015-07-30 Thread Francisco Jerez
Iago Toral Quiroga writes: > In theory, GRF array access should have been moved to scratch by the time > we got here, so this should never happen. A full piglit run forcing > spilling of all registers seems to confirm this. The FS backend > does not seem to check for this either. > --- > src/mes

Re: [Mesa-dev] [PATCH v2 5/6] i965/vec4: Adjust spilling cost for consecutive instructions

2015-07-30 Thread Francisco Jerez
Francisco Jerez writes: > Iago Toral Quiroga writes: > >> Previous patches made it so that we do not need to unspill the same vgrf >> with every instruction as long as these instructions come right after >> the register was spilled or unspilled. This means that actually

Re: [Mesa-dev] [PATCH v2 5/6] i965/vec4: Adjust spilling cost for consecutive instructions

2015-07-30 Thread Francisco Jerez
Francisco Jerez writes: > Iago Toral Quiroga writes: > >> Previous patches made it so that we do not need to unspill the same vgrf >> with every instruction as long as these instructions come right after >> the register was spilled or unspilled. This means that actually

Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-30 Thread Francisco Jerez
Iago Toral Quiroga writes: > When we have code such as this: > > mov vgrf1.0.x:F, vgrf2.:F > mov vgrf3.0.x:F, vgrf1.:F > ... > mov vgrf3.0.x:F, vgrf1.:F > > And vgrf1 is chosen for spilling, we can emit this: > > mov vgrf1.0.x:F, vgrf2.:F > gen4_scratch_write hw_reg0:F, vgrf1.

Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-30 Thread Francisco Jerez
Francisco Jerez writes: > Iago Toral Quiroga writes: > >> When we have code such as this: >> >> mov vgrf1.0.x:F, vgrf2.:F >> mov vgrf3.0.x:F, vgrf1.:F >> ... >> mov vgrf3.0.x:F, vgrf1.:F >> >> And vgrf1 is chosen for spilling,

Re: [Mesa-dev] [PATCH 9/9] glsl: Add constuctors for the common cases of glsl_struct_field

2015-07-30 Thread Francisco Jerez
field structures had to be converted to use the > constructor because C++ apparently forces you to do one or the other: > > builtin_types.cpp:61:1: error: could not convert '{glsl_type::float_type, > "near", -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0, -1}' from

Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-31 Thread Francisco Jerez
Iago Toral writes: > On Thu, 2015-07-30 at 17:08 +0300, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > When we have code such as this: >> > >> > mov vgrf1.0.x:F, vgrf2.:F >> > mov vgrf3.0.x:F, vgrf1.:F >> > .

Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-31 Thread Francisco Jerez
Iago Toral writes: > On Thu, 2015-07-30 at 17:14 +0300, Francisco Jerez wrote: >> Francisco Jerez writes: >> >> > Iago Toral Quiroga writes: >> > >> >> When we have code such as this: >> >> >> >> mov vgrf1.0.x:F, vgrf2.:

Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written

2015-07-31 Thread Francisco Jerez
Francisco Jerez writes: > Iago Toral writes: > >> On Thu, 2015-07-30 at 17:08 +0300, Francisco Jerez wrote: >>> Iago Toral Quiroga writes: >>> >>> > When we have code such as this: >>> > >>> > mov vgrf1.0.x:F, vgrf2.:F

Re: [Mesa-dev] [PATCH] i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795.

2015-07-31 Thread Francisco Jerez
> /Marta > >> -----Original Message- >> From: Francisco Jerez [mailto:curroje...@riseup.net] >> Sent: Thursday, July 30, 2015 2:23 PM >> To: mesa-dev@lists.freedesktop.org >> Cc: Lofstedt, Marta >> Subject: [PATCH] i965/fs: Fix regressi

Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v4)

2015-07-31 Thread Francisco Jerez
horizontal_slice_pitch(). [1] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076392.html [2] http://lists.freedesktop.org/archives/mesa-dev/2015-May/084141.html > --Jason > > On Thu, Jul 23, 2015 at 6:58 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes:

Re: [Mesa-dev] [PATCH 10/20] i965/fs: Implement image load, store and atomic.

2015-07-31 Thread Francisco Jerez
Jason Ekstrand writes: > On Thu, Jul 23, 2015 at 4:38 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> This all looks correct as far as I can tell. However, I'm very >>> concerned about the number of checks such as >>> ha

Re: [Mesa-dev] [PATCH] i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795.

2015-07-31 Thread Francisco Jerez
to delay the fix any further, I'll push it shortly. :) >> -Original Message- >> From: Francisco Jerez [mailto:curroje...@riseup.net] >> Sent: Friday, July 31, 2015 2:07 PM >> To: Lofstedt, Marta; mesa-dev@lists.freedesktop.org >> Subject: RE: [PATCH] i965/fs: Fix regress

Re: [Mesa-dev] [PATCH] clover: handle setKernelArg errors

2015-07-31 Thread Francisco Jerez
l::image_wr_argument::unbind(exec_context &ctx) { > > void > kernel::sampler_argument::set(size_t size, const void *value) { > + if (!value) > + throw error(CL_INVALID_SAMPLER); > + > if (size != sizeof(cl_sampler)) >throw error(CL_INVALID_ARG_SIZE);

Re: [Mesa-dev] [PATCH 1/2] clover: make dispatch matches functions def

2015-07-31 Thread Francisco Jerez
e, No space after '*'. > + size_t * param_value_size_ret); > Same here. With these fixed this patch is: Reviewed-by: Francisco Jerez > CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueFillBuffer)( >cl_command_queue command_queue, > @@ -701,7 +707,7

Re: [Mesa-dev] [PATCH] clover: handle setKernelArg errors

2015-07-31 Thread Francisco Jerez
Zoltán Gilián writes: > Could you please commit this? I don't have permissions. > Sure, I'll put them into my queue. > On Fri, Jul 31, 2015 at 3:55 PM, Francisco Jerez > wrote: >> Zoltan Gilian writes: >> >>> --- >>> sr

Re: [Mesa-dev] [PATCH 10/20] i965/fs: Implement image load, store and atomic.

2015-07-31 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jul 31, 2015 at 6:15 AM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Thu, Jul 23, 2015 at 4:38 AM, Francisco Jerez >>> wrote: >>>> Jason Ekstrand writes: >>>> >>

Re: [Mesa-dev] [PATCH] clover: clEnqueue* should block when asked for

2015-08-03 Thread Francisco Jerez
EdB writes: > As a side effect, this fix clRetain/ReleaseEvent Piglit test > --- > src/gallium/state_trackers/clover/api/transfer.cpp | 29 > -- > 1 file changed, 27 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp > b/src/ga

Re: [Mesa-dev] [PATCH] clover: fix image resource depth and array_size

2015-08-03 Thread Francisco Jerez
1; > } else { >info.width0 = obj.size(); >info.height0 = 1; Any reason you didn't fix this other branch to do the same? Or maybe just init it to one after the if? With that fixed: Reviewed-by: Francisco Jerez > -- > 2.4.6
