[Mesa-dev] [PATCH 37/47] clover: Override ret_object.

2016-07-03 Thread Francisco Jerez
From: Serge Martin Return an API object from an intrusive reference to a Clover object, incrementing the reference count of the object. Reviewed-by: Francisco Jerez --- src/gallium/state_trackers/clover/api/util.hpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium

[Mesa-dev] [PATCH 23/47] clover/llvm: Use metadata introspection utils for kernel enumeration.

2016-07-03 Thread Francisco Jerez
Reviewed-by: Serge Martin --- .../state_trackers/clover/llvm/invocation.cpp | 34 ++ 1 file changed, 3 insertions(+), 31 deletions(-) diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index afe621d

[Mesa-dev] [PATCH 43/47] clover/llvm: Get rid of compile_program_llvm().

2016-07-03 Thread Francisco Jerez
Superseded by compile_program() and link_program(). Reviewed-by: Serge Martin --- src/gallium/state_trackers/clover/core/compiler.hpp | 7 --- src/gallium/state_trackers/clover/llvm/invocation.cpp | 11 --- 2 files changed, 18 deletions(-) diff --git a/src/gallium/state_trackers/

Re: [Mesa-dev] [PATCH 20/47] clover/llvm: Clean up codestyle of get_kernel_args().

2016-07-04 Thread Francisco Jerez
Jan Vesely writes: > On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote: >> Reviewed-by: Serge Martin >> --- >>  .../state_trackers/clover/llvm/invocation.cpp  | 223 >> ++--- >>  1 file changed, 103 insertions(+), 120 deletions(-

Re: [Mesa-dev] [PATCH 10/47] clover/llvm: Clean up compilation into LLVM IR.

2016-07-04 Thread Francisco Jerez
Jan Vesely writes: > On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote: >> Some assorted and mostly trivial clean-ups for the source to bitcode >> compilation path. >> >> Reviewed-by: Serge Martin >> --- >>  .../state_tracke

Re: [Mesa-dev] [PATCH 4/6] i965/fs: do not require force_writemask_all with exec_size 4

2016-07-07 Thread Francisco Jerez
group with nibble granularity at best it's impractical to split instructions into chunks of execution size less than four. SIMD4 though definitely makes sense because of FP64. Either way patch is: Reviewed-by: Francisco Jerez >assert(inst->force_writemask_all || inst->

Re: [Mesa-dev] [PATCH 5/6] i965/fs: do pack lowering before simd splitting

2016-07-07 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > From: Iago Toral Quiroga > > So that we can have gen7 split large writes produced by the pack lowering. Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +- > 1 file changed, 5 insertions(+), 5 del

Re: [Mesa-dev] [PATCH 1/6] i965/fs: add a helper function to create double immediates

2016-07-07 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > From: Iago Toral Quiroga > > Gen7 hardware does not support double immediates so these need > to be moved in 32-bit chunks to a regular vgrf instead. Instead > of doing this every time we need to create a DF immediate, > create a helper function that does the
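
The 32-bit chunking mentioned in this commit message can be sketched in plain C. This is only an illustration under stated assumptions: `split_df_immediate` and `struct df_halves` are hypothetical names, not the helper the patch adds, and the sketch says nothing about how the i965 back-end actually emits the two moves. It also assumes an IEEE 754 `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the two 32-bit chunks of a double. */
struct df_halves {
   uint32_t lo;   /* bits 0..31 of the 64-bit pattern */
   uint32_t hi;   /* bits 32..63 of the 64-bit pattern */
};

/* Hypothetical helper (not the one from the patch): reinterpret the
 * double as a 64-bit integer and split it into two 32-bit halves, which
 * could then be written with two separate 32-bit moves.
 */
static struct df_halves
split_df_immediate(double value)
{
   uint64_t bits;
   struct df_halves h;

   assert(sizeof(bits) == sizeof(value));
   memcpy(&bits, &value, sizeof(bits));   /* well-defined type punning */

   h.lo = (uint32_t)(bits & 0xffffffffu);
   h.hi = (uint32_t)(bits >> 32);
   return h;
}
```

Using `memcpy` rather than a pointer cast keeps the reinterpretation within the rules of the C standard.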

Re: [Mesa-dev] [PATCH 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-07 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > From: Iago Toral Quiroga > > In fp64 we can produce code like this: > > mov(16) vgrf2<2>:UD, vgrf3<2>:UD > > That our simd lowering pass would typically split into instructions with a > width of 8, writing to two consecutive registers each. Unfortunately, gen7 >

Re: [Mesa-dev] [PATCH 6/6] i965/fs: don't copy propagate if the instruction writes to more than two adjacent GRFs

2016-07-07 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > This is not allowed by the HW and copy propagation can hide this issue from the > lower_simd_width pass, which is going to fix it. > > Signed-off-by: Samuel Iglesias Gonsálvez > --- > src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 1 + > 1 file changed, 1

Re: [Mesa-dev] [PATCH 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-08 Thread Francisco Jerez
Iago Toral writes: > On Thu, 2016-07-07 at 19:36 -0700, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez writes: >> >> > >> > From: Iago Toral Quiroga >> > >> > In fp64 we can produce code like this: >> > >> > mov(16)

Re: [Mesa-dev] [PATCH 04/47] clover/llvm: Collect #ifdef mess into a separate file.

2016-07-08 Thread Francisco Jerez
Jan Vesely writes: > On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote: >> This gets rid of most ifdef's from the invocation.cpp code -- Only a >> couple of them are left which will be removed differently in the >> following commits. >> >> Rev

Re: [Mesa-dev] [PATCH 20/47] clover/llvm: Clean up codestyle of get_kernel_args().

2016-07-09 Thread Francisco Jerez
Jan Vesely writes: > On Mon, 2016-07-04 at 12:31 -0700, Francisco Jerez wrote: >> Jan Vesely writes: >> >> > On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote: >> > > Reviewed-by: Serge Martin >> > > --- >> > >

Re: [Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-11 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > From: Iago Toral Quiroga > > In fp64 we can produce code like this: > > mov(16) vgrf2<2>:UD, vgrf3<2>:UD > > That our simd lowering pass would typically split into instructions with a > width of 8, writing to two consecutive registers each. Unfortunately, gen7 >

Re: [Mesa-dev] [PATCH v2 6/6] i965/fs: do d2x lowering before simd splitting

2016-07-11 Thread Francisco Jerez
Samuel Iglesias Gonsálvez writes: > So that we can have gen7 split large writes produced by this lowering pass. > > Signed-off-by: Samuel Iglesias Gonsálvez Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +- > 1 file changed,

Re: [Mesa-dev] [PATCH v2 3/6] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-11 Thread Francisco Jerez
Francisco Jerez writes: > Samuel Iglesias Gonsálvez writes: > >> From: Iago Toral Quiroga >> >> In fp64 we can produce code like this: >> >> mov(16) vgrf2<2>:UD, vgrf3<2>:UD >> >> That our simd lowering pass would typically split in i

Re: [Mesa-dev] [PATCH v3] i965/fs/gen7: split instructions that run into exec masking bugs

2016-07-12 Thread Francisco Jerez
t(exec_type_size); > + > + /* The hardware shifts exactly 8 channels per compressed half of the > + * instruction in single-precision mode and exactly 4 in > double-precision. > + */ > + if (channels_per_grf != (exec_type_size == 8 ? 4 : 8)) > +

Re: [Mesa-dev] [PATCH] clover: Pass unquoted compiler arguments to Clang

2016-07-12 Thread Francisco Jerez
Vedran Miletić writes: > On 06.06.2016 at 12:24, Vedran Miletić wrote: >> On 06/06/2016 02:04 AM, Francisco Jerez wrote: >>> Vedran Miletić writes: >>>> >>>> Aside from working just like NVIDIA and AMD proprietary stacks, no. >>>> >>

Re: [Mesa-dev] [PATCH] main/shaderimage: image unit invalid if texture is incomplete, independently of the level

2016-07-14 Thread Francisco Jerez
Alejandro Piñeiro writes: > Without this commit, an image is considered valid if the level of the > texture bound to the image is complete, something we can check as Mesa > saves independently whether it is "base incomplete" or "mipmap incomplete". > > But, from the OpenGL 4.3 Core Specification, sectio

Re: [Mesa-dev] [PATCH] clover: Pass unquoted compiler arguments to Clang

2016-07-14 Thread Francisco Jerez
Vedran Miletić writes: > On 07/13/2016 12:49 AM, Francisco Jerez wrote: >> You can just replace the current implementation of tokenize(), it's not >> used for anything else other than splitting compiler arguments AFAIK. >> > > Done, patch incoming. > >>>

Re: [Mesa-dev] [PATCH] main/shaderimage: image unit invalid if texture is incomplete, independently of the level

2016-07-15 Thread Francisco Jerez
Alejandro Piñeiro writes: > On 14/07/16 21:24, Francisco Jerez wrote: >> Alejandro Piñeiro writes: >> >>> Without this commit, an image is considered valid if the level of the >>> texture bound to the image is complete, something we can check as Mesa >

Re: [Mesa-dev] [PATCH 2/2] clover: Re-order includes in invocation.cpp to fix build

2016-07-20 Thread Francisco Jerez
eaks the build because the member > names of this class are replaced by the literal 1. Reviewed-by: Francisco Jerez > --- > .../state_trackers/clover/llvm/invocation.cpp | 24 > +++--- > 1 file changed, 17 insertions(+), 7 deletions(-) > > diff

Re: [Mesa-dev] [PATCH 1/2] clover: Add missing include v2

2016-07-20 Thread Francisco Jerez
te_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -39,6 +39,8 @@ > #include > > #include > +#include > + Redundant whitespace, with that removed: Reviewed-by: Francisco Jerez > #include

[Mesa-dev] [PATCH 00/17] Framebuffer fetch.

2016-07-20 Thread Francisco Jerez
This series implements the driver-independent part of the EXT_shader_framebuffer_fetch extension that provides fully programmable blending to GLES shaders. The GLSL IR and NIR representation of the framebuffer fetch functionality should be straightforward, but the way extension tracking works may

[Mesa-dev] [PATCH 05/17] mesa: Move shader memory barrier functions into barrier.c.

2016-07-20 Thread Francisco Jerez
--- src/mesa/main/barrier.c | 51 + src/mesa/main/barrier.h | 6 ++ src/mesa/main/shaderimage.c | 51 - src/mesa/main/shaderimage.h | 6 -- 4 files changed, 57 insertions(+), 57 deletions(-)

[Mesa-dev] [PATCH 06/17] mesa: Add blend barrier entry point and driver hook.

2016-07-20 Thread Francisco Jerez
Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent variant of KHR_blend_equation_advanced will use this driver hook to request coherency between framebuffer reads and writes. This intentionally doesn't hook up glBlendBarrierMESA to the dispatch layer since the extension isn't exp

[Mesa-dev] [PATCH 10/17] glsl: Define a gl_LastFragData built-in for GLSL versions that have gl_FragData.

2016-07-20 Thread Francisco Jerez
The EXT_shader_framebuffer_fetch extension defines alternative language for GLES2 shaders where user-defined fragment outputs are not allowed. Instead of using inout user-defined fragment outputs the shader is expected to read from the gl_LastFragData built-in array. In addition this allows using

[Mesa-dev] [PATCH 09/17] glsl: Handle the inout qualifier in fragment shader output declarations.

2016-07-20 Thread Francisco Jerez
According to the EXT_shader_framebuffer_fetch extension the inout qualifier can be used on ESSL 3.0+ shaders to declare a special kind of fragment output that gets implicitly initialized with the previous framebuffer contents at the current fragment coordinates. In addition we allow using the same

[Mesa-dev] [PATCH 04/17] mesa: Rename "texturebarrier" source files to "barrier".

2016-07-20 Thread Francisco Jerez
In preparation for collecting all pipeline barrier GL entry points into a single source file. --- src/mapi/glapi/gen/gl_genexec.py | 2 +- src/mesa/Makefile.sources | 4 ++-- src/mesa/drivers/common/driverfuncs.c | 4 ++-- src/mesa/main/{texturebarrier.c

[Mesa-dev] [PATCH 08/17] glsl: Add support for representing framebuffer fetch in the GLSL IR.

2016-07-20 Thread Francisco Jerez
The GLSL IR representation of framebuffer fetch amounts to a single bit in the ir_variable object applicable to fragment shader outputs. The flag indicates that the variable will be implicitly initialized to the previous contents of the render buffer at the same fragment coordinates and sample inde

[Mesa-dev] [PATCH 02/17] mesa: Add extension enables for framebuffer fetch extensions.

2016-07-20 Thread Francisco Jerez
This allows drivers to expose EXT_shader_framebuffer_fetch in GLES2+ contexts if desired. Note that this adds boolean flags for two MESA extensions, but only the EXT GLES-only extension is exposed for the moment, see the cover letter of this series [1] for the rationale. [1] https://lists.freedes

[Mesa-dev] [PATCH 03/17] mesa: Add support for querying GL_FRAGMENT_SHADER_DISCARDS_SAMPLES_EXT.

2016-07-20 Thread Francisco Jerez
This can currently only give true as result since the only way you can expose EXT_shader_framebuffer_fetch right now is by flipping the MESA_shader_framebuffer_fetch bit, but that could potentially change in the future, see [1] for an explanation. [1] https://lists.freedesktop.org/archives/mesa-de

[Mesa-dev] [PATCH 07/17] glsl: Add parser state enables for the framebuffer fetch extensions.

2016-07-20 Thread Francisco Jerez
--- src/compiler/glsl/glsl_parser_extras.cpp | 3 +++ src/compiler/glsl/glsl_parser_extras.h | 13 + 2 files changed, 16 insertions(+) diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp index 0077434..a2c1afe 100644 --- a/src/compiler/

[Mesa-dev] [PATCH 01/17] glapi: Add XML for GL_EXT_shader_framebuffer_fetch.

2016-07-20 Thread Francisco Jerez
--- src/mapi/glapi/gen/es_EXT.xml | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml index 6886dab..929e0e7 100644 --- a/src/mapi/glapi/gen/es_EXT.xml +++ b/src/mapi/glapi/gen/es_EXT.xml @@ -790,6 +790,11 @@ + + + +

[Mesa-dev] [PATCH 12/17] glsl/ast: Allow redeclaration of gl_LastFragData with different precision qualifier.

2016-07-20 Thread Francisco Jerez
--- src/compiler/glsl/ast_to_hir.cpp | 13 + 1 file changed, 13 insertions(+) diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp index c050a3f..ac651a9 100644 --- a/src/compiler/glsl/ast_to_hir.cpp +++ b/src/compiler/glsl/ast_to_hir.cpp @@ -3948,6 +3948,1

[Mesa-dev] [PATCH 17/17] nir: Handle FB fetch outputs correctly in nir_lower_io_to_temporaries.

2016-07-20 Thread Francisco Jerez
This requires emitting a series of copies at the top of the program from each output variable to the corresponding temporary. The initial copy can be skipped for non-framebuffer fetch outputs whose initial value is undefined, and the final copy needs to be skipped for read-only outputs (i.e. gl_La
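
The copy rules stated in this commit message can be condensed into two predicates. This is a minimal sketch under stated assumptions: `struct output_info` and both function names are hypothetical and do not mirror `nir_variable` or the real pass, which operates on NIR variables rather than on a plain struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a shader output for this sketch only. */
struct output_info {
   bool fb_fetch;    /* initial value is the previous framebuffer contents */
   bool read_only;   /* output is only ever read, e.g. a built-in like
                      * gl_LastFragData */
};

/* Copy output -> temporary at the top of the program?  Only useful when
 * the initial value of the output is defined, i.e. for framebuffer
 * fetch outputs.
 */
static bool
needs_initial_copy(const struct output_info *out)
{
   assert(out);
   return out->fb_fetch;
}

/* Copy temporary -> output at the end of the program?  Must be skipped
 * for read-only outputs, since they cannot be written.
 */
static bool
needs_final_copy(const struct output_info *out)
{
   assert(out);
   return !out->read_only;
}
```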

[Mesa-dev] [PATCH 13/17] glsl/linker: Allow fragment output overlap for gl_LastFragData.

2016-07-20 Thread Francisco Jerez
gl_LastFragData overlaps gl_FragData by definition. --- src/compiler/glsl/linker.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp index 6d45a02..aeaeb9c 100644 --- a/src/compiler/glsl/linker.cpp +++ b/src/compiler/glsl/linker.cp

[Mesa-dev] [PATCH 14/17] glsl: Don't consider read-only fragment outputs to be written to.

2016-07-20 Thread Francisco Jerez
Since they cannot be written. This prevents adding fragment outputs to the OutputsWritten set that are only read from via the gl_LastFragData array but never written to. --- src/compiler/glsl/ir_set_program_inouts.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compile

[Mesa-dev] [PATCH 15/17] glsl: Keep track of the set of fragment outputs read by a GL program.

2016-07-20 Thread Francisco Jerez
This is the set of shader outputs whose initial value is provided to the shader by some external means when the shader is executed, rather than computed by the shader itself. --- src/compiler/glsl/ir_set_program_inouts.cpp | 3 +++ src/mesa/main/mtypes.h | 1 + 2 files changed

[Mesa-dev] [PATCH 11/17] glsl: Don't attempt to do dead varying elimination on gl_LastFragData arrays.

2016-07-20 Thread Francisco Jerez
Apparently this pass can only handle elimination of a single built-in fragment output array, so the presence of gl_LastFragData (which it wouldn't split correctly anyway) could prevent it from splitting the actual gl_FragData array. Just match gl_FragData by name since it's the only built-in it ca

[Mesa-dev] [PATCH 16/17] nir: Pass through fb_fetch_output and OutputsRead from GLSL IR.

2016-07-20 Thread Francisco Jerez
The NIR representation of framebuffer fetch is the same as the GLSL IR's until interface variables are lowered away, at which point it will be translated to load output intrinsics. The GLSL-to-NIR pass just needs to copy the bits over to the NIR program. --- src/compiler/glsl/glsl_to_nir.cpp | 2

Re: [Mesa-dev] [PATCH] main/shaderimage: image unit invalid if texture is incomplete, independently of the level

2016-07-22 Thread Francisco Jerez
Alejandro Piñeiro writes: > Hi, > > On 15/07/16 22:46, Francisco Jerez wrote: >> Alejandro Piñeiro writes: >> >>> On 14/07/16 21:24, Francisco Jerez wrote: >>>> Alejandro Piñeiro writes: >>>> >>>>> Without this commit, a ima

[Mesa-dev] [PATCH 00/21] i965: Implement non-coherent framebuffer fetch.

2016-07-22 Thread Francisco Jerez
This is an implementation of non-coherent framebuffer fetch as described here [1] working on most hardware generations supported by the i965 driver (from Gen5 to Gen8). My plan was to send the coherent framebuffer fetch implementation for SKL+ first since it's actually simpler than the non-coheren

[Mesa-dev] [PATCH 20/21] i965: Implement glBlendBarrier.

2016-07-22 Thread Francisco Jerez
This is a no-op if the platform supports coherent framebuffer fetch. If it doesn't, we just need to flush the render cache and invalidate the texture cache in order for previous rendering to be visible to framebuffer fetch. --- src/mesa/drivers/dri/i965/brw_program.c | 20 1
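
The barrier behaviour described in this commit message can be sketched as a function that picks which caches to touch. All names here (`struct device_caps`, the two flag bits, `blend_barrier_flush_bits`) are hypothetical stand-ins for illustration, not the identifiers used in brw_program.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the driver's real flush flags. */
enum {
   SKETCH_FLUSH_RENDER_CACHE       = 1u << 0,
   SKETCH_INVALIDATE_TEXTURE_CACHE = 1u << 1
};

/* Hypothetical capability record for the sketch. */
struct device_caps {
   bool coherent_fb_fetch;
};

/* Return the cache operations a blend barrier would need: none when
 * framebuffer fetch is coherent, otherwise a render cache flush plus a
 * texture cache invalidation so earlier rendering becomes visible to
 * subsequent framebuffer reads.
 */
static unsigned
blend_barrier_flush_bits(const struct device_caps *caps)
{
   assert(caps);
   if (caps->coherent_fb_fetch)
      return 0;

   return SKETCH_FLUSH_RENDER_CACHE | SKETCH_INVALIDATE_TEXTURE_CACHE;
}
```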

[Mesa-dev] [PATCH 15/21] i965: Return the correct layout from get_isl_dim_layout for pre-ILK cube textures.

2016-07-22 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 5bf9243..602306b 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tre

[Mesa-dev] [PATCH 08/21] i965: Fix undefined signed overflow in INTEL_MASK for bitfields of 31 bits.

2016-07-22 Thread Francisco Jerez
Most likely we had only ever used this macro on bitfields of less than 31 bits -- That's going to change shortly. --- src/mesa/drivers/dri/i965/brw_defines.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw
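
The kind of overflow this patch refers to can be illustrated with a small sketch. `field_mask` is a hypothetical helper, not the actual INTEL_MASK definition from brw_defines.h: it only shows that building a mask by shifting a signed `1` left by 31 is undefined in C, while the same computation done on an unsigned value is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the Mesa macro): build a mask covering bits
 * low..high inclusive, using unsigned arithmetic only.
 */
static uint32_t
field_mask(unsigned high, unsigned low)
{
   unsigned width;
   uint32_t ones;

   assert(low <= high && high < 32);
   width = high - low + 1;

   /* A signed "1 << 31" overflows int, which is undefined behaviour.
    * UINT32_C(1) << 31 is fine.  Shifting by the full width of the type
    * is undefined even for unsigned types, so treat 32 separately.
    */
   ones = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
   return ones << low;
}
```

For example, `field_mask(30, 0)` describes a 31-bit field, the case the patch subject calls out.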

[Mesa-dev] [PATCH 02/21] i965/fs: Add brw_wm_prog_key bit specifying whether FB reads should be coherent.

2016-07-22 Thread Francisco Jerez
Some of the following changes in this series are specific to the non-coherent path, so I need some way to tell whether the coherent or non-coherent path is in use. The flag defaults to the value of the gl_extensions::MESA_shader_framebuffer_fetch enable so that it can be overridden easily on hardw

[Mesa-dev] [PATCH 17/21] i965: Massage argument list of brw_emit_surface_state().

2016-07-22 Thread Francisco Jerez
This commit does three different things in a single pass in order to keep the amount of churn low: Remove the for_gather boolean argument which was unused, pass the isl_view argument by value rather than by reference since I'll have to modify it from within the function, and add a target argument t

[Mesa-dev] [PATCH 05/21] i965/fs: Emit interpolation setup if non-coherent framebuffer fetch is in use.

2016-07-22 Thread Francisco Jerez
This will be required for the next commit since the non-coherent path makes use of the fragment coordinates implicitly, so they need to be calculated. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp

[Mesa-dev] [PATCH 07/21] i965/fs: Special-case nir_intrinsic_store_output for the fragment shader.

2016-07-22 Thread Francisco Jerez
I'm about to change how fragment shader output locations are represented, so the generic nir_intrinsic_store_output implementation that assumes that outputs are just contiguous elements in the big nir_outputs array won't work anymore. This somewhat simplified implementation of nir_intrinsic_store_

[Mesa-dev] [PATCH 09/21] i965/fs: Rework representation of fragment output locations in NIR.

2016-07-22 Thread Francisco Jerez
The problem with the current approach is that driver output locations are represented as a linear offset within the nir_outputs array, which makes it rather difficult for the back-end to figure out what color output and index some nir_intrinsic_load/store_output was meant for, because the offset of

[Mesa-dev] [PATCH 12/21] i965: Return whether the miptree was resolved from intel_miptree_resolve_color().

2016-07-22 Thread Francisco Jerez
This will allow optimizing out the cache flush in some cases when resolving wasn't necessary. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 12 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/

[Mesa-dev] [PATCH 11/21] i965/fs: Translate nir_intrinsic_load_output on a fragment output.

2016-07-22 Thread Francisco Jerez
This gets the non-coherent framebuffer fetch path hooked up to the NIR front-end. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 281c704

[Mesa-dev] [PATCH 01/21] i965/fs: Get rid of fs_visitor::do_dual_src.

2016-07-22 Thread Francisco Jerez
This boolean flag was being used for two different things: - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can just set it based on whether the dual_src_output register is valid, which will be the case if the shader writes the secondary blending color. - To decide wheth

[Mesa-dev] [PATCH 06/21] i965/fs: Implement non-coherent framebuffer fetch using the sampler unit.

2016-07-22 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 89 1 file changed, 89 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 2872b2d..f5f918d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b

[Mesa-dev] [PATCH 21/21] i965: Flip the non-coherent framebuffer fetch extension bit on G45-Gen8 hardware.

2016-07-22 Thread Francisco Jerez
This is not enabled on the original Gen4 part because it lacks surface state tile offsets so it may not be possible to sample from arbitrary non-zero layers of the framebuffer depending on the miptree layout (it should be possible to work around this by allocating a scratch surface and doing the sa

[Mesa-dev] [PATCH 04/21] i965/fs: Force per-sample dispatch if the shader reads from a multisample FBO.

2016-07-22 Thread Francisco Jerez
The result of a framebuffer fetch from a multisample FBO is inherently per-sample, so the spec requires at least those sections of the shader that depend on the framebuffer fetch result to be executed once per sample. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++- 1 file changed, 2 insertions(+

[Mesa-dev] [PATCH 16/21] i965: Add missing has_surface_tile_offset flag to the Gen8+ device info structures.

2016-07-22 Thread Francisco Jerez
This surface state control has been supported by all hardware generations since G45. --- src/mesa/drivers/dri/i965/brw_device_info.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index 77bbe78..4d90a

[Mesa-dev] [PATCH 18/21] i965: Implement support for overriding the texture target in brw_emit_surface_state.

2016-07-22 Thread Francisco Jerez
This allows the caller to bind a miptree using a texture target other than the one it was created with. The code should work even if the memory layouts of the specified and original targets don't match, as long as the caller only intends to access a single slice of the miptree structure. This

[Mesa-dev] [PATCH 14/21] i965: Factor out isl_surf_dim/isl_dim_layout calculation into functions.

2016-07-22 Thread Francisco Jerez
The logic to calculate the right layout and dimensionality for a given GL texture target is going to be useful elsewhere, factor it out from intel_miptree_get_isl_surf(). --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 71 ++- src/mesa/drivers/dri/i965/intel_mipmap_tree

[Mesa-dev] [PATCH 10/21] i965/fs: Allocate fragment output temporaries on demand.

2016-07-22 Thread Francisco Jerez
This gets rid of the duplication of logic between nir_setup_outputs() and get_frag_output() by allocating fragment output temporaries lazily whenever get_frag_output() is called. This makes nir_setup_outputs() a no-op for the fragment shader stage. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7
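
The on-demand allocation described in this commit message follows a common lazy-initialization pattern, sketched below. The struct, the constant and `get_output_temporary` are hypothetical and only stand in for the idea; the real get_frag_output() works on back-end registers, not plain integers.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_FRAG_OUTPUTS 8   /* hypothetical limit for the sketch */

/* Hypothetical bookkeeping for fragment output temporaries. */
struct frag_outputs {
   bool     allocated[SKETCH_MAX_FRAG_OUTPUTS];
   unsigned reg[SKETCH_MAX_FRAG_OUTPUTS];
   unsigned next_reg;   /* next free temporary register number */
};

/* Return the temporary backing output `index`, allocating it the first
 * time it is requested.  Outputs that are never requested never consume
 * a register, so no separate set-up pass is needed.
 */
static unsigned
get_output_temporary(struct frag_outputs *outs, unsigned index)
{
   assert(outs && index < SKETCH_MAX_FRAG_OUTPUTS);

   if (!outs->allocated[index]) {
      outs->reg[index] = outs->next_reg++;
      outs->allocated[index] = true;
   }

   return outs->reg[index];
}
```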

[Mesa-dev] [PATCH 03/21] i965: Allocate space in the binding table for non-coherent FB fetch.

2016-07-22 Thread Francisco Jerez
Unfortunately due to the inconsistent meaning of some surface state structure fields, we cannot re-use the same binding table entries for sampling from and rendering into the same set of render buffers, so we need to allocate a separate binding table block specifically for render target reads if th

[Mesa-dev] [PATCH 19/21] i965: Upload surface state for non-coherent framebuffer fetch.

2016-07-22 Thread Francisco Jerez
This iterates over the list of attached render buffers and binds appropriate surface state structures to the binding table block allocated for shader framebuffer read. --- src/mesa/drivers/dri/i965/brw_state.h| 1 + src/mesa/drivers/dri/i965/brw_state_upload.c | 4 ++ src/mesa/dr

[Mesa-dev] [PATCH 13/21] i965: Resolve color for non-coherent FB fetch at UpdateState time.

2016-07-22 Thread Francisco Jerez
This is required because the sampler unit used to fetch from the framebuffer is unable to interpret non-color-compressed fast-cleared single-sample texture data. Roughly the same limitation applies for surfaces bound to texture or image units, but unlike texture sampling, non-coherent framebuffer

Re: [Mesa-dev] [PATCH 02/95] i965/vec4/nir: simplify glsl_type_for_nir_alu_type()

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga writes: > From: Connor Abbott > > Less duplication, one less case to handle for doubles and support > for sized NIR types. > > v2: Fix call to get_instance by swapping rows and columns params (Iago) > > Signed-off-by: Iago Toral Quiroga Revi

Re: [Mesa-dev] [PATCH 09/95] i965/vec4: add support for printing DF immediates

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga writes: > From: Connor Abbott > Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_

Re: [Mesa-dev] [PATCH 04/95] i965/vec4/nir: Add bit-size information to types

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga writes: Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_ve

Re: [Mesa-dev] [PATCH 05/95] i965/vec4/nir: support doubles in ALU operations

2016-07-25 Thread Francisco Jerez
> + unsigned dst_bit_size = nir_dest_bit_size(instr->dest.dest); > + dst_type = (nir_alu_type) (dst_type | dst_bit_size); Seems rather confusing to declare two temporaries for this and assign one of them twice, when you could have written the nir_alu_type as a straightforward closed-form express

Re: [Mesa-dev] [PATCH 03/95] i965/vec4/nir: allocate two registers for dvec3/dvec4

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga writes: > From: Connor Abbott > > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 6662a1e..1f8fa80

Re: [Mesa-dev] [PATCH] main/shaderimage: image unit invalid if texture is incomplete, independently of the level

2016-07-26 Thread Francisco Jerez
Alejandro Piñeiro writes: > On 23/07/16 00:31, Francisco Jerez wrote: >> Alejandro Piñeiro writes: >> >>> Hi, >>> >>> On 15/07/16 22:46, Francisco Jerez wrote: >>>> Alejandro Piñeiro writes: >>>> >>>>

Re: [Mesa-dev] [PATCH 01/21] i965/fs: Get rid of fs_visitor::do_dual_src.

2016-07-26 Thread Francisco Jerez
Anuj Phogat writes: > On Fri, Jul 22, 2016 at 8:58 PM, Francisco Jerez > wrote: >> This boolean flag was being used for two different things: >> >> - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can >>just set it based on whether the dua

Re: [Mesa-dev] [PATCH 10/17] glsl: Define a gl_LastFragData built-in for GLSL versions that have gl_FragData.

2016-07-27 Thread Francisco Jerez
Kenneth Graunke writes: > On Wednesday, July 20, 2016 9:49:40 PM PDT Francisco Jerez wrote: >> The EXT_shader_framebuffer_fetch extension defines alternative >> language for GLES2 shaders where user-defined fragment outputs are not >> allowed. Instead of using inout

Re: [Mesa-dev] [Patch v2] clover: make GCC 4.8 happy

2016-07-27 Thread Francisco Jerez
Dieter Nützel writes: > Could one of you commit this for me after review, please? Reviewed-by and pushed, thanks. > > Thanks, >Dieter > > On 28.07.2016 00:20, Dieter Nützel wrote: >> Without this GCC 4.8.x throws the error below: >> >> error: invalid initialization of non-const reference of typ

Re: [Mesa-dev] [PATCH 02/12] glsl: Replace big pile of hand-written code with a generator

2016-07-27 Thread Francisco Jerez
Ilia Mirkin writes: > Another alternative would be to make these into IR nodes. This could > be done by groups, e.g. adding a special "atomic" ir type with > subtypes, or .. whatever else. Just pointing out that this fun with > strings isn't totally necessary. Not sure why it was done that way in

Re: [Mesa-dev] [PATCH 10/17] glsl: Define a gl_LastFragData built-in for GLSL versions that have gl_FragData.

2016-07-27 Thread Francisco Jerez
Kenneth Graunke writes: > On Wednesday, July 27, 2016 5:05:39 PM PDT Francisco Jerez wrote: >> Kenneth Graunke writes: >> >> > On Wednesday, July 20, 2016 9:49:40 PM PDT Francisco Jerez wrote: >> >> The EXT_shader_framebuffer_fetch extension defines alternat

Re: [Mesa-dev] [PATCH 1/2] clover: assert struct argument is compiled usably

2016-07-27 Thread Francisco Jerez
Emil Velikov writes: > On 6 June 2016 at 00:02, Vedran Miletić wrote: >> On 06/04/2016 04:18 AM, Francisco Jerez wrote: >>> >>> Serge Martin writes: >>> >>>> From: Vedran Miletić >>>> >>>> Make sure that a struct argumen

[Mesa-dev] [PATCH 9/9] i965: Expose shader framebuffer fetch extensions on Gen9+.

2016-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/intel_extensions.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 12bf454..33114db 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b

[Mesa-dev] [PATCH 6/9] i965/fs: Don't CSE render target messages with different target index.

2016-07-28 Thread Francisco Jerez
We weren't checking the fs_inst::target field when comparing whether two instructions are equal. For FB writes it doesn't matter because they aren't CSE-able anyway, but this would have become a problem with FB reads which are expression-like instructions. --- src/mesa/drivers/dri/i965/brw_fs_cse
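
The comparison problem described in this commit message can be shown with a reduced example. `struct sketch_inst` and `insts_equal_for_cse` are hypothetical and far simpler than fs_inst and the real CSE pass; the sketch only demonstrates that two instructions with the same opcode but different render target indices must not be treated as the same expression.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced instruction record for the sketch. */
struct sketch_inst {
   unsigned opcode;
   unsigned target;   /* render target index the message refers to */
};

/* Two instructions are candidates for common subexpression elimination
 * only if every field that affects the result matches, including the
 * render target index.
 */
static bool
insts_equal_for_cse(const struct sketch_inst *a, const struct sketch_inst *b)
{
   assert(a && b);
   return a->opcode == b->opcode &&
          a->target == b->target;
}
```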

[Mesa-dev] [PATCH 2/9] i965/eu: Add codegen support for the Gen9+ render target read message.

2016-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h | 4 src/mesa/drivers/dri/i965/brw_eu.h | 8 src/mesa/drivers/dri/i965/brw_eu_emit.c | 28 3 files changed, 40 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i

[Mesa-dev] [PATCH 4/9] i965/fs: Define framebuffer read virtual opcode.

2016-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h| 3 +++ src/mesa/drivers/dri/i965/brw_fs.cpp | 2 ++ src/mesa/drivers/dri/i965/brw_fs.h | 2 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 20 src/mesa/drivers/dri/i965/brw_shader.cpp | 2

[Mesa-dev] [PATCH 3/9] i965/disasm: Fix RC message type strings on Gen7+.

2016-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_disasm.c | 28 +--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c b/src/mesa/drivers/dri/i965/brw_disasm.c index d74d5d5..cca4c8b 100644 --- a/src/mesa/drivers/dri/i965/brw_disasm.c +

[Mesa-dev] [PATCH 7/9] i965/fs: Remove special casing of framebuffer writes in scheduler code.

2016-07-28 Thread Francisco Jerez
The reason why it was safe for the scheduler to ignore the side effects of framebuffer write instructions was that its side effects couldn't have had any influence on any other instruction in the program, because we weren't doing framebuffer reads, and framebuffer writes were always non-overlapping

[Mesa-dev] [PATCH 8/9] i965/fs: Hook up coherent framebuffer reads to the NIR front-end.

2016-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index e3215da..85e111d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp

[Mesa-dev] [PATCH 5/9] i965/fs: Define logical framebuffer read opcode and lower it to physical reads.

2016-07-28 Thread Francisco Jerez
--- src/mesa/drivers/dri/i965/brw_defines.h | 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 24 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 1 + src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++ 4 files changed, 28 insertions(+) diff --git a/src/mesa/drivers/dri/i965/

[Mesa-dev] [PATCH 0/9] i965: Implement coherent framebuffer fetch on Gen9+.

2016-07-28 Thread Francisco Jerez
This series gets coherent render target reads working with the i965 driver and exposes the EXT_shader_framebuffer_fetch extension on Gen9+ platforms. It's dependent on the series I sent last week to make the driver-independent changes to enable framebuffer fetch [1], and the series to enable non-c

[Mesa-dev] [PATCH 1/9] i965/eu: Take into account the target cache argument in brw_set_dp_read_message.

2016-07-28 Thread Francisco Jerez
brw_set_dp_read_message() was setting the data cache as send message SFID on Gen7+ hardware, ignoring the target cache specified by the caller. Some of the callers were passing a bogus target cache value as argument relying on brw_set_dp_read_message not to take it into account. Fix them too. ---

Re: [Mesa-dev] [PATCH 12/17] glsl/ast: Allow redeclaration of gl_LastFragData with different precision qualifier.

2016-07-28 Thread Francisco Jerez
Kenneth Graunke writes: > On Wednesday, July 20, 2016 9:49:42 PM PDT Francisco Jerez wrote: >> --- >> src/compiler/glsl/ast_to_hir.cpp | 13 + >> 1 file changed, 13 insertions(+) >> >> diff --git a/src/compiler/glsl/ast_to_hir.cpp >> b/sr

Re: [Mesa-dev] [PATCH 10/95] i965/vec4: handle 32 and 64 bit channels in liveness analysis

2016-07-29 Thread Francisco Jerez
Iago Toral Quiroga writes: > From: "Juan A. Suarez Romero" > > Our current data flow analysis does not take into account that channels > on 64-bit operands are 64-bit. This is a problem when the same register > is accessed using both 64-bit and 32-bit channels. This is very common > in operation

Re: [Mesa-dev] [PATCH 10/17] glsl: Define a gl_LastFragData built-in for GLSL versions that have gl_FragData.

2016-07-29 Thread Francisco Jerez
Francisco Jerez writes: > Kenneth Graunke writes: > >> On Wednesday, July 27, 2016 5:05:39 PM PDT Francisco Jerez wrote: >>> Kenneth Graunke writes: >>> >>> > On Wednesday, July 20, 2016 9:49:40 PM PDT Francisco Jerez wrote: >>> &g

Re: [Mesa-dev] [PATCH 09/21] i965/fs: Rework representation of fragment output locations in NIR.

2016-07-31 Thread Francisco Jerez
Kenneth Graunke writes: > On Friday, July 22, 2016 8:59:03 PM PDT Francisco Jerez wrote: >> The problem with the current approach is that driver output locations >> are represented as a linear offset within the nir_outputs array, which >> makes it rather difficult for the b

Re: [Mesa-dev] [PATCH 17/95] i965/vec4: add dst_null_df()

2016-08-02 Thread Francisco Jerez
Reviewed-by: Francisco Jerez Iago Toral Quiroga writes: > --- > src/mesa/drivers/dri/i965/brw_vec4.h | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h > b/src/mesa/drivers/dri/i965/brw_vec4.h > index 3043147..afcf31e 10

Re: [Mesa-dev] [PATCH 16/95] i965/vec4: We only support 32-bit integer ALU operations for now

2016-08-02 Thread Francisco Jerez
Iago Toral Quiroga writes: > Add asserts so we remember to address this when we enable 64-bit > integer support, as suggested by Connor and Jason. Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 70 > ++ >

Re: [Mesa-dev] [PATCH 15/95] i965/disasm: align16 DF source regions have a width of 2

2016-08-02 Thread Francisco Jerez
Reviewed-by: Francisco Jerez Iago Toral Quiroga writes: > --- > src/mesa/drivers/dri/i965/brw_disasm.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c > b/src/mesa/drivers/dri/i965/brw_disasm.c > i

Re: [Mesa-dev] [PATCH 13/95] i965: add brw_vecn_grf()

2016-08-02 Thread Francisco Jerez
Iago Toral Quiroga writes: > From: Connor Abbott > Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_reg.h | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_reg.h > b/src/mesa/drivers/dri/i965/brw_reg.h

Re: [Mesa-dev] [PATCH 09/95] i965/vec4: add support for printing DF immediates

2016-08-02 Thread Francisco Jerez
Iago Toral Quiroga writes: > From: Connor Abbott > Reviewed-by: Francisco Jerez > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_

Re: [Mesa-dev] [PATCH 28/95] i965/vec4: fix register allocation for 64-bit undef sources

2016-08-02 Thread Francisco Jerez
sa_values[instr->def.index] = dst_reg(VGRF, alloc.allocate(1)); > + nir_ssa_values[instr->def.index] = > + dst_reg(VGRF, alloc.allocate(instr->def.bit_size / 32)); I think you want to use DIV_ROUND_UP here instead, with that fixed: R
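
A minimal sketch of the difference the review comment points at, assuming a round-up division macro of the usual form (written out locally here rather than taken from any Mesa header). The helper name regs_for_bit_size is hypothetical, and whether bit sizes below 32 actually reach this code path is not something the excerpt states.

```c
/* Round-up integer division, defined locally for this sketch. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: number of 32-bit units needed to hold a value
 * of the given bit size.  Plain `bit_size / 32` truncates, so any size
 * below 32 would yield 0 units; rounding up yields at least 1. */
static unsigned regs_for_bit_size(unsigned bit_size)
{
    return DIV_ROUND_UP(bit_size, 32);
}
```

For 32-bit and 64-bit values both forms agree (1 and 2 units); they differ only when the bit size is not a multiple of 32.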

Re: [Mesa-dev] [PATCH 33/95] i965/vec4: implement d2b

2016-08-02 Thread Francisco Jerez
Iago Toral Quiroga writes: > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 18 ++ > 1 file changed, 18 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 1525a3d..4014020 100644 > --- a/src/mesa/d

Re: [Mesa-dev] [PATCH 25/95] i965/vec4: fix base offset for nir_registers with doubles

2016-08-02 Thread Francisco Jerez
Iago Toral Quiroga writes: > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index cf35f2e..fde7b60 100644 > --- a/src/me

Re: [Mesa-dev] [PATCH 36/95] i965/vec4: add a helper function to create double immediates

2016-08-03 Thread Francisco Jerez
Iago Toral Quiroga writes: > Gen7 hardware does not support double immediates so these need > to be moved in 32-bit chunks to a regular vgrf instead. Instead > of doing this every time we need to create a DF immediate, > create a helper function that does the right thing depending > on the hardwa
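
As a rough illustration of moving a double "in 32-bit chunks", the sketch below splits a double into its low and high 32-bit halves. It assumes IEEE 754 binary64 doubles; the function name split_double is hypothetical, and how the driver's helper actually emits or orders the two moves is not shown in the excerpt.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split the bit pattern of a double into two
 * 32-bit halves, as would be needed to load it with 32-bit moves. */
static void split_double(double value, uint32_t *lo, uint32_t *hi)
{
    uint64_t bits;

    /* memcpy reinterprets the bytes without violating aliasing rules. */
    memcpy(&bits, &value, sizeof bits);

    *lo = (uint32_t)(bits & 0xffffffffu);  /* low 32 bits */
    *hi = (uint32_t)(bits >> 32);          /* high 32 bits */
}
```

Under the IEEE 754 assumption, 1.0 splits into a low half of 0 and a high half of 0x3ff00000.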

Re: [Mesa-dev] [PATCH 40/95] i965/vec4: add a SIMD lowering pass

2016-08-03 Thread Francisco Jerez
Iago Toral Quiroga writes: > Generally, instructions in Align16 mode only ever write to a single > register and don't need any form of SIMD splitting, that's why we > have never had a SIMD splitting pass in the vec4 backend. However, > double-precision instructions typically write 2 registers an
