[Mesa-dev] [PATCH 10/14] i965: Hook up image state upload.

2015-02-06 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_context.h | 8 +++- src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 25 src/mesa/drivers/dri/i965/brw_state.h| 3 ++ src/mesa/drivers/dri/i965/brw_state_upload.c | 10 + src/mesa/drivers/dri/i965/brw_vs_surface_state.

[Mesa-dev] [PATCH 09/14] i965: Reserve enough parameter entries for all image uniforms used in the program.

2015-02-06 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_gs.c | 1 + src/mesa/drivers/dri/i965/brw_vs.c | 4 ++-- src/mesa/drivers/dri/i965/brw_wm.c | 3 ++- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c index ce3cba4..bfb64f3 1006

[Mesa-dev] [PATCH 04/14] i965/gen7: Factor out texture surface state set-up from gen7_update_texture_surface().

2015-02-06 Thread Francisco Jerez
This moves most of the surface state set-up logic that can be shared between textures and shader images to a separate function. --- src/mesa/drivers/dri/i965/brw_context.h | 11 ++ src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 124 +- 2 files changed, 83 insert

[Mesa-dev] [PATCH 03/14] i965: Add helper functions to calculate the slice pitch of an array or 3D miptree.

2015-02-06 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_tex_layout.c| 45 +-- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 18 +++ 2 files changed, 53 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layou

[Mesa-dev] [PATCH 06/14] i965: Generalize the update_null_renderbuffer_surface vtbl hook to non-renderbuffers.

2015-02-06 Thread Francisco Jerez
Null surfaces are going to be useful to have something to point unbound image units to, as the ARB_shader_image_load_store extension requires us to behave deterministically in cases where some shader tries to access an unbound image unit: Invalid stores and atomics are supposed to be discarded and

[Mesa-dev] [PATCH 07/14] i965: Implement surface state set-up for shader images.

2015-02-06 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_context.h | 2 + src/mesa/drivers/dri/i965/brw_surface_formats.c | 111 +++ src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 77 3 files changed, 190 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_cont

[Mesa-dev] [PATCH 13/14] i965/gen7-8: Set up early depth/stencil control appropriately for image load/store.

2015-02-06 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h | 3 +++ src/mesa/drivers/dri/i965/gen7_wm_state.c| 7 +++ src/mesa/drivers/dri/i965/gen8_depth_state.c | 12 src/mesa/drivers/dri/i965/gen8_ps_state.c| 13 + 4 files changed, 31 insertions(+), 4 deletions(-)

[Mesa-dev] [PATCH 05/14] i965/gen8: Factor out texture surface state set-up from gen8_update_texture_surface().

2015-02-06 Thread Francisco Jerez
This moves most of the surface state set-up logic that can be shared between textures and shader images to a separate function. --- src/mesa/drivers/dri/i965/gen8_surface_state.c | 136 ++--- 1 file changed, 77 insertions(+), 59 deletions(-) diff --git a/src/mesa/drivers/dri/i

[Mesa-dev] [PATCH 12/14] i965/gen7-8: Poke the 3DSTATE UAV access enable bits.

2015-02-06 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h | 3 +++ src/mesa/drivers/dri/i965/gen7_gs_state.c | 4 +++- src/mesa/drivers/dri/i965/gen7_vs_state.c | 13 - src/mesa/drivers/dri/i965/gen7_wm_state.c | 3 +++ src/mesa/drivers/dri/i965/gen8_gs_state.c | 4 +++- src/mesa/drivers/dri/i

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-06 Thread Francisco Jerez
Hey Matt, Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> MRFs cannot be read from anyway so they cannot possibly be a valid >> source of LOAD_PAYLOAD. >> --- > > The function only seems to test inst->dst.file == MRF. I don

Re: [Mesa-dev] [PATCH 09/32] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.

2015-02-06 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> Scalar registers are required to have zero stride, fix the >> regs_written calculation not to assume that the instruction writes >> zero registers in that case. >> --- >> src/mes

Re: [Mesa-dev] [PATCH 17/32] i965/vec4: Fix constant propagation across different types.

2015-02-06 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> If the source type differs from the original type of the constant we >> need to bit-cast it before propagating, otherwise the original type >> information will be lost. If the constant was

Re: [Mesa-dev] [PATCH 01/14] i965: Don't tile 1D miptrees.

2015-02-06 Thread Francisco Jerez
Kenneth Graunke writes: > On Friday, February 06, 2015 07:23:15 PM Francisco Jerez wrote: >> It doesn't really improve locality of texture fetches, quite the >> opposite it's a waste of memory bandwidth and space due to tile >> alignment. >> --- >> sr

Re: [Mesa-dev] [PATCH 02/14] i965: Allocate binding table space for shader images.

2015-02-06 Thread Francisco Jerez
Kenneth Graunke writes: > On Friday, February 06, 2015 07:23:16 PM Francisco Jerez wrote: >> Reviewed-by: Paul Berry >> --- >> src/mesa/drivers/dri/i965/brw_context.h | 5 + >> src/mesa/drivers/dri/i965/brw_shader.cpp | 7 +++ >> 2 files changed, 12

Re: [Mesa-dev] [PATCH 09/32] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.

2015-02-08 Thread Francisco Jerez
Kenneth Graunke writes: > On Saturday, February 07, 2015 02:10:19 AM Francisco Jerez wrote: >> Matt Turner writes: >> >> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez >> > wrote: >> >> Scalar registers are required to have zero stride, fix the

Re: [Mesa-dev] [PATCH 17/32] i965/vec4: Fix constant propagation across different types.

2015-02-08 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 4:17 PM, Francisco Jerez wrote: >> Matt Turner writes: >> >>> On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez >>> wrote: >>>> If the source type differs from the original type of the constant w

Re: [Mesa-dev] [PATCH 26/32] i965/vec4: Don't assume a value is dead when its VGRF is only partially overwritten.

2015-02-08 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:43 AM, Francisco Jerez wrote: >> --- >> src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_

Re: [Mesa-dev] [PATCH 08/32] i965/fs: Fix stack allocation of fs_inst and stop stealing src array provided on construction.

2015-02-09 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> Using 'ralloc*(this, ...)' is wrong if the object has automatic >> storage or was allocated through any other means. Use normal dynamic >> memory instead. >> --- > > I

Re: [Mesa-dev] [PATCH 14/32] i965/fs: Fix register coalesce not to lose track of the second half of 16-wide moves.

2015-02-09 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> Fixes rewrite by the register coalesce pass of references to >> individual halves of 16-wide coalesced registers. >> --- >> src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 8 +++

Re: [Mesa-dev] [PATCH 18/32] i965/vec4: Don't attempt to reduce swizzles of send from GRF instructions.

2015-02-09 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> --- >> src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp >> b

Re: [Mesa-dev] [PATCH 19/32] i965/vec4: Pass dst register to the vec4_instruction constructor.

2015-02-09 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez wrote: >> So regs_written gets initialized with a sensible value. >> --- >> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +-- >> 1 file changed, 5 insertions(+), 6 deletions(-)

Re: [Mesa-dev] [PATCH 21/32] i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers.

2015-02-09 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:43 AM, Francisco Jerez wrote: >> --- >> src/mesa/drivers/dri/i965/brw_ir_vec4.h | 1 + >> src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 15 ++- >> src/mesa/dr

[Mesa-dev] [PATCHv2 32/32] i965: Don't compact instructions with unmapped bits.

2015-02-09 Thread Francisco Jerez
Some instruction bits don't have a mapping defined to any compacted instruction field. If they're ever set and we end up compacting the instruction they will be forced to zero. Avoid using compaction in such cases. v2: Align multiple lines of an expression to the same column. Change conditi

[Mesa-dev] [PATCHv2 01/32] i965: Factor out virtual GRF allocation to a separate object.

2015-02-09 Thread Francisco Jerez
Right now virtual GRF book-keeping and allocation is performed in each visitor class separately (among other hundred different things), leading to duplicated logic in each visitor and preventing layering as it forces any code that manipulates i965 IR and needs to allocate virtual registers to depen

[Mesa-dev] [PATCH] mesa: Rename the CEILING() macro to DIV_ROUND_UP().

2015-02-09 Thread Francisco Jerez
Some people have complained that code using the CEILING() macro is difficult to understand because it's not immediately obvious what it is supposed to do until you go and look up its definition. Use a more descriptive name that matches the similar utility macro in the Linux kernel. --- src/mesa/d
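For readers who have not met the macro, here is a minimal sketch of what a round-up division helper of this kind computes. The definition is modeled on the Linux kernel macro of the same name; the patch's own definition is cut off in the snippet above, so this form is an assumption, and `units_needed` is a hypothetical helper added only for illustration.

```c
/* Assumed definition, modeled on the Linux kernel's DIV_ROUND_UP();
 * the text of the patch itself is truncated in the snippet above. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of fixed-size units needed to hold
 * `size` items when each unit holds `unit` items. */
static unsigned units_needed(unsigned size, unsigned unit)
{
   return DIV_ROUND_UP(size, unit);
}
```

With this definition, nine items in units of eight need two units, while sixteen items need exactly two as well.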

[Mesa-dev] [PATCH] i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst.

2015-02-09 Thread Francisco Jerez
Scalar registers are required to have zero stride, fix the regs_written calculation not to assume that the instruction writes zero registers in that case. v2: Rename CEILING() to DIV_ROUND_UP(). (Matt, Ken) Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++- 1 file ch

[Mesa-dev] [PATCH] mesa: Bump MAX_IMAGE_UNIFORMS to 32.

2015-02-09 Thread Francisco Jerez
So the i965 driver can expose 32 image uniforms per shader stage. --- src/mesa/main/config.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/config.h b/src/mesa/main/config.h index 4ec4b75..08e1a14 100644 --- a/src/mesa/main/config.h +++ b/src/mesa/main/config.h @

Re: [Mesa-dev] [PATCH 02/14] i965: Allocate binding table space for shader images.

2015-02-09 Thread Francisco Jerez
Kenneth Graunke writes: > On Saturday, February 07, 2015 03:03:44 AM Francisco Jerez wrote: >> Kenneth Graunke writes: >> >> > On Friday, February 06, 2015 07:23:16 PM Francisco Jerez wrote: >> >> Reviewed-by: Paul Berry >> >> --- >

[Mesa-dev] [PATCH] i965: Don't tile 1D miptrees.

2015-02-09 Thread Francisco Jerez
It doesn't really improve locality of texture fetches, quite the opposite it's a waste of memory bandwidth and space due to tile alignment. v2: Check mt->logical_height0 instead of mt->target (Ken). Add short comment explaining why they shouldn't be tiled. --- src/mesa/drivers/dri/i965/intel

Re: [Mesa-dev] [PATCH 14/14] i965/gen7-8: Implement glMemoryBarrier().

2015-02-09 Thread Francisco Jerez
Kristian Høgsberg writes: > On Fri, Feb 6, 2015 at 9:23 AM, Francisco Jerez wrote: >> --- >> src/mesa/drivers/dri/i965/brw_program.c | 40 >> + >> src/mesa/drivers/dri/i965/intel_reg.h | 1 + >> 2 files changed, 41 insertio

Re: [Mesa-dev] [PATCH 11/14] i965/gen7: Enable fragment shader dispatch if the program has image uniforms.

2015-02-09 Thread Francisco Jerez
Kenneth Graunke writes: > On Friday, February 06, 2015 07:23:25 PM Francisco Jerez wrote: >> Shaders with image uniforms may have side effects. Make sure that >> fragment shader threads are dispatched if the shader has any image >> uniforms. >> --- >> src/mesa

[Mesa-dev] [PATCHv2 11/14] i965/gen7: Enable fragment shader dispatch if the program has image uniforms.

2015-02-09 Thread Francisco Jerez
Shaders with image uniforms may have side effects. Make sure that fragment shader threads are dispatched if the shader has any image uniforms. v2: Use brw_stage_state::nr_image_params to find out if the shader has image uniforms instead of checking core mesa data structures (Ken). --- src/me

[Mesa-dev] [PATCHv2 12/14] i965/gen7-8: Poke the 3DSTATE UAV access enable bits.

2015-02-09 Thread Francisco Jerez
v2: Set the PS UAV-only bit on HSW (Ken). --- src/mesa/drivers/dri/i965/brw_defines.h | 4 src/mesa/drivers/dri/i965/gen7_gs_state.c | 4 +++- src/mesa/drivers/dri/i965/gen7_vs_state.c | 13 - src/mesa/drivers/dri/i965/gen7_wm_state.c | 9 + src/mesa/drivers/dri/i965/

[Mesa-dev] [PATCHv2 13/14] i965/gen7-8: Set up early depth/stencil control appropriately for image load/store.

2015-02-09 Thread Francisco Jerez
v2: Store early fragment test mode in brw_wm_prog_data instead of getting it from core mesa data structures (Ken). --- src/mesa/drivers/dri/i965/brw_context.h | 1 + src/mesa/drivers/dri/i965/brw_defines.h | 3 +++ src/mesa/drivers/dri/i965/brw_wm.c | 2 ++ src/mesa/drivers

Re: [Mesa-dev] [PATCHv2 32/32] i965: Don't compact instructions with unmapped bits.

2015-02-09 Thread Francisco Jerez
Matt Turner writes: > On Mon, Feb 9, 2015 at 6:08 AM, Francisco Jerez wrote: >> Some instruction bits don't have a mapping defined to any compacted >> instruction field. If they're ever set and we end up compacting the >> instruction they will be forced to zero.

[Mesa-dev] [PATCHv3 32/32] i965: Don't compact instructions with unmapped bits.

2015-02-09 Thread Francisco Jerez
Some instruction bits don't have a mapping defined to any compacted instruction field. If they're ever set and we end up compacting the instruction they will be forced to zero. Avoid using compaction in such cases. v2: Align multiple lines of an expression to the same column. Change conditi

[Mesa-dev] [PATCHv4 32/32] i965: Don't compact instructions with unmapped bits.

2015-02-10 Thread Francisco Jerez
Some instruction bits don't have a mapping defined to any compacted instruction field. If they're ever set and we end up compacting the instruction they will be forced to zero. Avoid using compaction in such cases. v2: Align multiple lines of an expression to the same column. Change conditi

Re: [Mesa-dev] [PATCH] i965: Use new/delete instead of realloc() in brw_ir_allocator.h

2015-02-11 Thread Francisco Jerez
s = new unsigned[capacity]; > +memcpy(tmp_offsets, offsets, count * sizeof(unsigned)); > +delete[] offsets; > + offsets = tmp_offsets; > } > > sizes[count] = size; > -- > 1.8.5.1 Looks OK to me, Re

Re: [Mesa-dev] [PATCH] i965: Use new/delete instead of realloc() in brw_ir_allocator.h

2015-02-11 Thread Francisco Jerez
Matt Turner writes: > On Wed, Feb 11, 2015 at 6:37 AM, Juha-Pekka Heikkila > wrote: >> There is no error path available thus instead of giving >> realloc possibility to fail use new which will never >> return null pointer and throws bad_alloc on failure. > > The problem was that we weren't check

Re: [Mesa-dev] [PATCH] i965: Use new/delete instead of realloc() in brw_ir_allocator.h

2015-02-11 Thread Francisco Jerez
Matt Turner writes: > On Wed, Feb 11, 2015 at 9:16 AM, Francisco Jerez > wrote: >> Matt Turner writes: >> >>> On Wed, Feb 11, 2015 at 6:37 AM, Juha-Pekka Heikkila >>> wrote: >>>> There is no error path available thus instead of giving >>

Re: [Mesa-dev] [PATCH] i965: Use new/delete instead of realloc() in brw_ir_allocator.h

2015-02-11 Thread Francisco Jerez
Matt Turner writes: >[...] > Indeed. And another thing to consider is that we've discussed > compiling with -fno-exceptions. > Heh, the benefit you get from doing that is virtually zero. And in cases like this where failure would have to be handled many levels up in the stack and require redesig

Re: [Mesa-dev] [PATCH 0/7] i965 L3 caching and pull constant improvements.

2015-02-12 Thread Francisco Jerez
Francisco Jerez writes: > Kenneth Graunke writes: > >> On Sunday, January 18, 2015 01:04:02 AM Francisco Jerez wrote: >>> This is the first part of a series meant to improve our usage of the L3 >>> cache. >>> Currently it's far from ideal s

Re: [Mesa-dev] [PATCH 21/32] i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers.

2015-02-13 Thread Francisco Jerez
Matt Turner writes: > On Fri, Feb 6, 2015 at 6:43 AM, Francisco Jerez wrote: >> --- > > We don't have any operations today that return more than a single > register in the vec4 backend, do we? Presumably this is partly > preparation for image_load_store? > Yeah,

Re: [Mesa-dev] [PATCH 01/32] i965: Factor out virtual GRF allocation to a separate object.

2015-02-13 Thread Francisco Jerez
Matt Turner writes: > On Mon, Feb 9, 2015 at 11:25 AM, Matt Turner wrote: >> On Fri, Feb 6, 2015 at 2:40 PM, Matt Turner wrote: >>> 8 - Sent a question >>> 9 - Like mine better? >>> 10 - Looks wrong to me >>> 11-13 - Asked Jason to review >>> 14 - Asked for an example showing the problem >>> 15

Re: [Mesa-dev] [PATCH 01/32] i965: Factor out virtual GRF allocation to a separate object.

2015-02-13 Thread Francisco Jerez
Francisco Jerez writes: > Matt Turner writes: > >> On Mon, Feb 9, 2015 at 11:25 AM, Matt Turner wrote: >>> On Fri, Feb 6, 2015 at 2:40 PM, Matt Turner wrote: >>>> 8 - Sent a question >>>> 9 - Like mine better? >>>> 10 - Looks wrong to

[Mesa-dev] [PATCH] i965/vec4: Calculate register allocation q values manually.

2015-02-13 Thread Francisco Jerez
This fixes a regression in the running time of Piglit introduced by commit 78e9043475d4bed8b50f7e413963c960fa0935bb, which increased the number of register allocation classes set up by the VEC4 back-end from 2 to 16. The algorithm used by ra_set_finalize() to calculate them is unnecessarily expens

[Mesa-dev] [PATCH] i965/vec4: Override destination register writemask in sampler message send.

2015-02-13 Thread Francisco Jerez
This line was removed by accident in commit 16b911257440afbd77a6eb762e28df62e3c19bc7 causing a regression in the ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert Khronos conformance test. It's necessary because the swizzle_result() code below expects all four components of the vector to be valid.

Re: [Mesa-dev] [PATCH] i965/vec4: Calculate register allocation q values manually.

2015-02-13 Thread Francisco Jerez
Connor Abbott writes: > I'll ask the same question I asked Jason when he did this for FS... > did you verify that the new q_values is the same as the old one? > Yeah, I did. > On Fri, Feb 13, 2015 at 8:02 AM, Francisco Jerez > wrote: >> This fixes a regression in

Re: [Mesa-dev] [PATCH] i965/simd8vs: Fix SIMD8 atomics

2015-02-16 Thread Francisco Jerez
emit(MOV(component(sources[0], 7), brw_flag_reg(0, 1))) > ->force_writemask_all = true; > + } else if (stage == MESA_SHADER_VERTEX) { > + emit(MOV(component(sources[0], 7), > + brw_imm_ud(0xff)))->force_

Re: [Mesa-dev] [PATCH] i965/vec4: Override destination register writemask in sampler message send.

2015-02-16 Thread Francisco Jerez
Hi Ian, Ian Romanick writes: > Please tag the commit with > > Cc: "10.5" > I don't think that's necessary, the commit that caused this regression isn't part of 10.5. > On 02/13/2015 05:03 AM, Francisco Jerez wrote: >>

Re: [Mesa-dev] [PATCH] i965/simd8vs: Fix SIMD8 atomics

2015-02-16 Thread Francisco Jerez
Jason Ekstrand writes: > On Feb 15, 2015 11:55 PM, "Ben Widawsky" > wrote: >> >> The short version: we need to set bits in R0.7 which provide a mask to be > used >> for PS kill samples/pixels. Since the VS has no such concept, we just > need to >> set all 1. >> >> The longer version... >> Execut

[Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-16 Thread Francisco Jerez
The round-robin allocation strategy is expected to decrease the amount of false dependencies created by the register allocator and give the post-RA scheduling pass more freedom to move instructions around. On the other hand it has the disadvantage of increasing fragmentation and decreasing the num

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-16 Thread Francisco Jerez
Jason Ekstrand writes: > On Feb 16, 2015 8:35 AM, "Francisco Jerez" wrote: >> >> The round-robin allocation strategy is expected to decrease the amount >> of false dependencies created by the register allocator and give the >> post-RA scheduling pass mor

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-16 Thread Francisco Jerez
Jason Ekstrand writes: > On Feb 16, 2015 9:34 AM, "Francisco Jerez" wrote: >> >> Jason Ekstrand writes: >> >> > On Feb 16, 2015 8:35 AM, "Francisco Jerez" > wrote: >> >> >> >> The round-robin allocation strategy

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-16 Thread Francisco Jerez
Matt Turner writes: > On Mon, Feb 16, 2015 at 10:40 AM, Francisco Jerez > wrote: >> My intuition is that the huge performance improvement Matt observed by >> disabling the third scheduling heuristic is more likely to have been >> caused by a decrease in the amount of c

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-16 Thread Francisco Jerez
Jason Ekstrand writes: > On Mon, Feb 16, 2015 at 10:40 AM, Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > On Feb 16, 2015 9:34 AM, "Francisco Jerez" >> wrote: >> >> >> >> Jason Ekstrand writes

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-17 Thread Francisco Jerez
lead to any measurable improvement in any of the other cases (actually I would be very surprised if that's the case), so this suggestion seems a bit of a premature optimization to me, how about we KISS for now. > > > On Mon, Feb 16, 2015 at 11:39 AM, Francisco Jerez > wrote:

[Mesa-dev] [PATCHv2] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-17 Thread Francisco Jerez
The round-robin allocation strategy is expected to decrease the amount of false dependencies created by the register allocator and give the post-RA scheduling pass more freedom to move instructions around. On the other hand it has the disadvantage of increasing fragmentation and decreasing the num

Re: [Mesa-dev] [PATCHv2] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-17 Thread Francisco Jerez
Tom Stellard writes: > On Tue, Feb 17, 2015 at 03:23:05PM +0200, Francisco Jerez wrote: >> The round-robin allocation strategy is expected to decrease the amount >> of false dependencies created by the register allocator and give the >> post-RA scheduling pass more freedom

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-17 Thread Francisco Jerez
Jason Ekstrand writes: > On Mon, Feb 16, 2015 at 11:39 AM, Francisco Jerez > wrote: > >> The round-robin allocation strategy is expected to decrease the amount >> of false dependencies created by the register allocator and give the >> post-RA scheduling pass more f

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-17 Thread Francisco Jerez
Connor Abbott writes: > On Tue, Feb 17, 2015 at 8:15 AM, Francisco Jerez > wrote: >> Connor Abbott writes: >> >>> Hi Francisco, >>> >> Hi Connor, and thank you for your feedback. >> >>> A few comments: >>> >>> 1

Re: [Mesa-dev] [PATCH] i965/simd8vs: Fix SIMD8 atomics (read-only)

2015-02-18 Thread Francisco Jerez
e will fix cases that write atomics, such as > atomicCounterIncrement, and this change will fix cases that only read > atomics, such as atomicCounter. > > Signed-off-by: Jordan Justen > Cc: Ben Widawsky > Cc: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs_visito

Re: [Mesa-dev] [PATCH] ra: Disable round-robin strategy for optimistically colorable nodes.

2015-02-18 Thread Francisco Jerez
Connor Abbott writes: > On Tue, Feb 17, 2015 at 3:04 PM, Francisco Jerez > wrote: >> Jason Ekstrand writes: >> >>> On Mon, Feb 16, 2015 at 11:39 AM, Francisco Jerez >>> wrote: >>> >>>> The round-robin allocation strategy is expected

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez > wrote: > >> Hey Matt, >> >> Matt Turner writes: >> >> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez >> wrote: >> >> MRFs cannot be read from anyway so

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-19 Thread Francisco Jerez
Jason Ekstrand writes: > On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez >> > wrote: >> > >> >> Hey Matt, >> >> >>

Re: [Mesa-dev] [PATCH] i965/fs: Set pixel/sample mask for compute shaders atomic ops

2015-02-20 Thread Francisco Jerez
s, so we set all bits to enabled. >> > >> > Note: this mask is ANDed with the execution mask, so some channels may not >> > end >> > up issuing the atomic operation. >> > >> > Signed-off-by: Jordan Justen >> > Cc: Ben Widawsky >>

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
v/2015-February/076097.html [4] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html > --Jason > > On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand > wrote: > >> >> >> On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez >> wrote: >>

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > I'm still a little pensive. But >> > >> > Reviewed-by: Jason Ekstrand >> > >> Thanks. >> >>

Re: [Mesa-dev] [PATCH 12/32] i965/fs: Fix lower_load_payload() to take into account stride in the metadata guess.

2015-02-20 Thread Francisco Jerez
image_load_store code that ends up using byte types with stride=4 for some image formats. > > On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez > wrote: > >> --- >> src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez >> > wrote: >> > >> >> Jason Ekstrand writes: >> >

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
;t be handled in roughly the same way you do it now? Recognize when src[i + 4] is the same 8-wide register as src[i] shifted by 8 and emit a COMPR4 copy in that case? > On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand > wrote: > >> >> >> On Fri, Feb 20, 2015 at 1:09 PM, Fra

[Mesa-dev] [PATCH 7/7] i965: Fix variable indexing of sampler arrays under non-uniform control flow.

2015-02-20 Thread Francisco Jerez
ARB_gpu_shader5 requires sampler array indexing expressions to be dynamically uniform, this however doesn't have any implications on the control flow that leads to the evaluation of that expression being uniform. Use emit_uniformize() to obtain an arbitrary live value from the binding table index

[Mesa-dev] [PATCH 5/7] i965: Define helper function to copy an arbitrary live component from some register.

2015-02-20 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs.h | 2 ++ src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 12 src/mesa/drivers/dri/i965/brw_vec4.h | 3 +++ src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++ 4 files changed, 28 insertions(+) diff --git a/sr

[Mesa-dev] [PATCH 1/7] i965: Introduce the BROADCAST pseudo-opcode.

2015-02-20 Thread Francisco Jerez
The BROADCAST instruction picks the channel from its first source given by an index passed in as second source. This will be used in situations where all channels from the same SIMD thread have to agree on the value of something, e.g. a surface binding table index. --- src/mesa/drivers/dri/i965/b
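The behaviour described above can be sketched as a plain software model: one channel of the first source, selected by an index, is copied to every channel of the destination. This is not the driver's implementation. `SIMD_WIDTH`, `model_broadcast` and the wrap-around of out-of-range indices are assumptions made for illustration.

```c
#include <stdint.h>

#define SIMD_WIDTH 8   /* illustrative width, not taken from the patch */

/* Software model only: copy the channel of `src` selected by `index`
 * into every channel of `dst`, so all channels agree on one value.
 * Out-of-range indices wrap around (a choice made for this sketch). */
static void model_broadcast(uint32_t dst[SIMD_WIDTH],
                            const uint32_t src[SIMD_WIDTH],
                            unsigned index)
{
   /* Read the value first so the model still works if dst aliases src. */
   const uint32_t value = src[index % SIMD_WIDTH];

   for (unsigned ch = 0; ch < SIMD_WIDTH; ch++)
      dst[ch] = value;
}
```

In the binding table case mentioned in the message, `src` would hold the per-channel index values and the result would be the single value every channel uses.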

[Mesa-dev] [PATCH 4/7] i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode.

2015-02-20 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 49 ++ src/mesa/drivers/dri/i965/brw_fs.h | 1 + src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 41 + src/mesa/drivers/dri/i965/brw_vec4.h

[Mesa-dev] [PATCH 6/7] i965: Fix variable indexing of UBO arrays under non-uniform control flow.

2015-02-20 Thread Francisco Jerez
ARB_gpu_shader5 requires UBO array indexing expressions to be dynamically uniform, this however doesn't have any implications on the control flow that leads to the evaluation of that expression being uniform. Use emit_uniformize() to obtain an arbitrary live value from the binding table index calc

[Mesa-dev] [PATCH 2/7] i965: Perform basic optimizations on the BROADCAST opcode.

2015-02-20 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs.cpp| 15 +++ src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 1 + src/mesa/drivers/dri/i965/brw_fs_cse.cpp| 1 + src/mesa/drivers/dri/i965/brw_ir_fs.h | 7 +++ src/mesa/drivers/d

[Mesa-dev] [PATCH 3/7] i965: Introduce the FIND_LIVE_CHANNEL pseudo-opcode.

2015-02-20 Thread Francisco Jerez
This instruction calculates the index of an arbitrary channel enabled in the current execution mask. It's expected to be used as input for the BROADCAST opcode, but it's implemented as a separate instruction rather than being baked into BROADCAST because FIND_LIVE_CHANNEL has no dependencies so it
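As a rough software model of the description above, the function below returns the index of one enabled channel in an execution mask. The message only promises an arbitrary live channel; picking the lowest set bit, and returning 0 for an empty mask, are assumptions of this sketch, and `model_find_live_channel` is a hypothetical name.

```c
#include <stdint.h>

/* Software model only: return the index of an enabled channel in
 * `exec_mask`. This sketch picks the lowest set bit; the real opcode
 * is only described as returning an arbitrary live channel. Returns 0
 * when no channel is enabled (a choice made for this sketch). */
static unsigned model_find_live_channel(uint32_t exec_mask)
{
   for (unsigned ch = 0; ch < 32; ch++) {
      if (exec_mask & (UINT32_C(1) << ch))
         return ch;
   }
   return 0;
}
```

The returned index is the kind of value that would be fed to BROADCAST as its channel selector.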

Re: [Mesa-dev] [PATCH 1/7] i965: Introduce the BROADCAST pseudo-opcode.

2015-02-20 Thread Francisco Jerez
The same opcodes will be used for dynamically uniform indexing of image arrays too. > On 02/20/2015 11:48 AM, Francisco Jerez wrote: >> The BROADCAST instruction picks the channel from its first source >> given by an index passed in as second source. This will be used in >

Re: [Mesa-dev] [PATCH 1/3] clover: Don't unconditionally define cl_khr_fp64

2015-02-27 Thread Francisco Jerez
Tom Stellard writes: > This should be done by the frontend for devices that support this > extension. Reviewed-by: Francisco Jerez > --- > src/gallium/state_trackers/clover/llvm/invocation.cpp | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/src/gallium/state

[Mesa-dev] [PATCH 04/13] i965: Mask out unused Align16 components in brw_untyped_atomic.

2015-02-27 Thread Francisco Jerez
This is currently not a problem because the vec4 visitor happens to mask out unused components from the destination, but it might become an issue when we start using atomics without writeback message. In any case it seems sensible to set it again here because the consequences of setting the wrong

[Mesa-dev] [PATCH 07/13] i965: Don't request untyped atomic writeback message if the destination is null.

2015-02-27 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 4e695f5..48ee

[Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.

2015-02-27 Thread Francisco Jerez
Change brw_untyped_atomic() and brw_untyped_surface_read() to take the surface index as a register instead of a constant and to use brw_send_indirect_message() to emit the indirect variant of send with a dynamically calculated message descriptor. This will be required to support variable indexing

[Mesa-dev] [PATCH 13/13] i965: Add memory fence opcode.

2015-02-27 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h | 2 + src/mesa/drivers/dri/i965/brw_eu.h | 4 ++ src/mesa/drivers/dri/i965/brw_eu_emit.c | 70 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 ++ src/mesa/drivers/dri/i965/brw_shader.cpp

[Mesa-dev] [PATCH 09/13] i965: Pass the number of components as a source of the untyped surface read opcode.

2015-02-27 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 +++-- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 6 -- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 3 ++- 4 files changed, 10 insertions(+), 6 deletions(-) diff --

[Mesa-dev] [PATCH 10/13] i965: Reorder sources of the untyped atomic opcode.

2015-02-27 Thread Francisco Jerez
This is consistent with the untyped surface read opcode. From now on all typed and untyped surface access opcodes will follow the same pattern: src[0] will be the message payload, src[1] will be the surface index and src[2] will be a control immediate (atomic operation for atomic opcodes and numbe
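The source ordering described in this message could be pictured as follows. This is only a sketch under stated assumptions: the enum, the struct and `make_surface_inst()` are hypothetical names invented for illustration and do not appear in the patch; only the slot order (payload, surface index, control immediate) comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical names for the three source slots shared by the
 * surface access opcodes, in the order given by the commit message. */
enum surface_src {
    SURFACE_SRC_PAYLOAD = 0, /* src[0]: message payload */
    SURFACE_SRC_INDEX   = 1, /* src[1]: surface index */
    SURFACE_SRC_CONTROL = 2, /* src[2]: control immediate */
    SURFACE_SRC_COUNT
};

/* Hypothetical, simplified instruction record with plain integers
 * standing in for register and immediate operands. */
struct surface_inst {
    uint32_t src[SURFACE_SRC_COUNT];
};

/* Build an instruction record following the common source layout. */
static struct surface_inst make_surface_inst(uint32_t payload_reg,
                                             uint32_t surface_index,
                                             uint32_t control_imm)
{
    struct surface_inst inst;
    inst.src[SURFACE_SRC_PAYLOAD] = payload_reg;
    inst.src[SURFACE_SRC_INDEX]   = surface_index;
    inst.src[SURFACE_SRC_CONTROL] = control_imm;
    return inst;
}
```

Keeping one layout for every opcode means code that inspects the surface index or the control immediate can read the same slot regardless of which surface opcode it is looking at.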

[Mesa-dev] [PATCH 08/13] i965/vec4: Add support for untyped surface message sends from GRF.

2015-02-27 Thread Francisco Jerez
This doesn't actually enable untyped surface message sends from GRF yet; the upcoming atomic counter and image intrinsic lowering code will. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 --- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 16 +++- src/mesa/drivers/d

[Mesa-dev] [PATCH 03/13] i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read.

2015-02-27 Thread Francisco Jerez
And calculate the message response size based on the number of components rather than the other way around. This simplifies their interface somewhat and allows the caller to request a writeback message with more than one vector component in SIMD4x2 mode. --- src/mesa/drivers/dri/i965/brw_eu.h

[Mesa-dev] [PATCH 02/13] i965: Don't disable exec masking for sampler message sends.

2015-02-27 Thread Francisco Jerez
This was telling the sampler to do texture fetches for *all* channels in the non-constant surface index case, which could have reduced throughput unnecessarily when some of the channels were disabled by control flow. --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 12 ++-- src/mesa/d

[Mesa-dev] [PATCH 06/13] i965: Simplify generator code for untyped surface messages.

2015-02-27 Thread Francisco Jerez
The generate_untyped_*() methods do nothing useful other than calling the corresponding function from brw_eu_emit.c. The calls to brw_mark_surface_used() will go away too in a future commit. --- src/mesa/drivers/dri/i965/brw_fs.h | 11 -- src/mesa/drivers/dri/i965/brw_fs_generat

[Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.

2015-02-27 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_eu.h | 19 ++-- src/mesa/drivers/dri/i965/brw_eu_emit.c | 58 ++-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 55 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 37 --- 4

[Mesa-dev] [PATCH 12/13] i965: Add typed surface access opcodes.

2015-02-27 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h| 4 + src/mesa/drivers/dri/i965/brw_eu.h | 24 +++ src/mesa/drivers/dri/i965/brw_eu_emit.c| 169 + src/mesa/drivers/dri/i965/brw_fs.cpp | 12 ++ src/mesa/drivers/dri/i965/brw_f

[Mesa-dev] [PATCH 11/13] i965: Add untyped surface write opcode.

2015-02-27 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h| 1 + src/mesa/drivers/dri/i965/brw_eu.h | 7 +++ src/mesa/drivers/dri/i965/brw_eu_emit.c| 51 ++ src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++ src/mesa/drivers/dri/i965/brw_fs_g

Re: [Mesa-dev] [PATCH 1/2] clover: Report a default value for CL_DEVICE_SINGLE_FP_CONFIG

2015-03-02 Thread Francisco Jerez
Tom Stellard writes: > --- > src/gallium/state_trackers/clover/api/device.cpp | 3 +-- > src/gallium/state_trackers/clover/core/device.cpp | 6 ++ > src/gallium/state_trackers/clover/core/device.hpp | 1 + > 3 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/state

Re: [Mesa-dev] [PATCH] clover: Enable cl_khr_fp64 for devices that support doubles v4

2015-03-05 Thread Francisco Jerez
ice query function from cl_khr_fp86() to > has_doubles(). > > v3: > - Return 0 for device::doubled_fp_confg() when doubles aren't > supported. > > v4: > - Remove device query for double fp_config. Reviewed-by: Francisco Jerez > --- > s

Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG

2015-03-05 Thread Francisco Jerez
Tom Stellard writes: > This means dropping CL_FP_DENORM from the current return value. > --- > src/gallium/state_trackers/clover/api/device.cpp | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/state_trackers/clover/api/device.cpp > b/src/gallium/state_trac

Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.

2015-03-06 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Fri, Mar 06, 2015 at 10:37:06AM +0200, Pohjolainen, Topi wrote: >> On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote: >> >[..] >> > +/** >> > + * Send message to shared unit \p sfid with a possibly indir

Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.

2015-03-06 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote: >> Change brw_untyped_atomic() and brw_untyped_surface_read() to take the >> surface index as a register instead of a constant and to use >> brw_send_indirect_message()

Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.

2015-03-06 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Fri, Mar 06, 2015 at 02:29:15PM +0200, Francisco Jerez wrote: >> "Pohjolainen, Topi" writes: >> >> > On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote: >> >> Change brw_untyped_atomic()

Re: [Mesa-dev] [PATCH 04/13] i965: Mask out unused Align16 components in brw_untyped_atomic.

2015-03-06 Thread Francisco Jerez
"Pohjolainen, Topi" writes: > On Fri, Feb 27, 2015 at 05:34:47PM +0200, Francisco Jerez wrote: >> This is currently not a problem because the vec4 visitor happens to >> mask out unused components from the destination, but it might become >> an issue when we start
