From: Serge Martin
Return an API object from an intrusive reference to a Clover object,
incrementing the reference count of the object.
Reviewed-by: Francisco Jerez
---
src/gallium/state_trackers/clover/api/util.hpp | 11 +++
1 file changed, 11 insertions(+)
diff --git a/src/gallium
Reviewed-by: Serge Martin
---
.../state_trackers/clover/llvm/invocation.cpp | 34 ++
1 file changed, 3 insertions(+), 31 deletions(-)
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index afe621d
Superseded by compile_program() and link_program().
Reviewed-by: Serge Martin
---
src/gallium/state_trackers/clover/core/compiler.hpp | 7 ---
src/gallium/state_trackers/clover/llvm/invocation.cpp | 11 ---
2 files changed, 18 deletions(-)
diff --git a/src/gallium/state_trackers/
Jan Vesely writes:
> On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote:
>> Reviewed-by: Serge Martin
>> ---
>> .../state_trackers/clover/llvm/invocation.cpp | 223 ++---
>> 1 file changed, 103 insertions(+), 120 deletions(-
Jan Vesely writes:
> On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote:
>> Some assorted and mostly trivial clean-ups for the source to bitcode
>> compilation path.
>>
>> Reviewed-by: Serge Martin
>> ---
>> .../state_tracke
group with nibble granularity at best
it's impractical to split instructions into chunks of execution size
less than four. SIMD4 though definitely makes sense because of FP64.
Either way patch is:
Reviewed-by: Francisco Jerez
>assert(inst->force_writemask_all || inst->
Samuel Iglesias Gonsálvez writes:
> From: Iago Toral Quiroga
>
> So that we can have gen7 split large writes produced by the pack lowering.
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +-
> 1 file changed, 5 insertions(+), 5 del
Samuel Iglesias Gonsálvez writes:
> From: Iago Toral Quiroga
>
> Gen7 hardware does not support double immediates so these need
> to be moved in 32-bit chunks to a regular vgrf instead. Instead
> of doing this every time we need to create a DF immediate,
> create a helper function that does the
Samuel Iglesias Gonsálvez writes:
> From: Iago Toral Quiroga
>
> In fp64 we can produce code like this:
>
> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>
> That our simd lowering pass would typically split in instructions with a
> width of 8, writing to two consecutive registers each. Unfortunately, gen7
>
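The register footprint of the strided move quoted above can be worked out with a few lines of arithmetic. This is only a sketch: it assumes 32-byte registers, and `regs_written` is a hypothetical helper, not a function from the driver.

```cpp
// Assumption: one general register (GRF) holds 32 bytes.
constexpr unsigned GRF_SIZE_BYTES = 32;

// Hypothetical helper: number of registers spanned by a destination region
// with the given execution width, element size in bytes and element stride.
constexpr unsigned regs_written(unsigned exec_size, unsigned type_size,
                                unsigned stride)
{
   return (exec_size * type_size * stride + GRF_SIZE_BYTES - 1) /
          GRF_SIZE_BYTES;
}

// mov(16) vgrf2<2>:UD covers 16 * 4 * 2 = 128 bytes, i.e. four registers.
static_assert(regs_written(16, 4, 2) == 4, "SIMD16 strided UD write");
// Each SIMD8 half still covers 8 * 4 * 2 = 64 bytes, i.e. two registers.
static_assert(regs_written(8, 4, 2) == 2, "SIMD8 strided UD write");
// A SIMD4 chunk would cover 4 * 4 * 2 = 32 bytes, i.e. one register.
static_assert(regs_written(4, 4, 2) == 1, "SIMD4 strided UD write");
```

The checks run at compile time, so the block builds only if the arithmetic in the comments holds.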
Samuel Iglesias Gonsálvez writes:
> This is not allowed by the HW, and copy propagation can hide this issue from
> the lower_simd_width pass, which is going to fix it.
>
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---
> src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 1 +
> 1 file changed, 1
Iago Toral writes:
> On Thu, 2016-07-07 at 19:36 -0700, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez writes:
>>
>> >
>> > From: Iago Toral Quiroga
>> >
>> > In fp64 we can produce code like this:
>> >
>> > mov(16)
Jan Vesely writes:
> On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote:
>> This gets rid of most ifdef's from the invocation.cpp code -- Only a
>> couple of them are left which will be removed differently in the
>> following commits.
>>
>> Rev
Jan Vesely writes:
> On Mon, 2016-07-04 at 12:31 -0700, Francisco Jerez wrote:
>> Jan Vesely writes:
>>
>> > On Sun, 2016-07-03 at 17:51 -0700, Francisco Jerez wrote:
>> > > Reviewed-by: Serge Martin
>> > > ---
>> > > >
Samuel Iglesias Gonsálvez writes:
> From: Iago Toral Quiroga
>
> In fp64 we can produce code like this:
>
> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>
> That our simd lowering pass would typically split in instructions with a
> width of 8, writing to two consecutive registers each. Unfortunately, gen7
>
Samuel Iglesias Gonsálvez writes:
> So that we can have gen7 split large writes produced by this lowering pass.
>
> Signed-off-by: Samuel Iglesias Gonsálvez
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 10 +-
> 1 file changed,
Francisco Jerez writes:
> Samuel Iglesias Gonsálvez writes:
>
>> From: Iago Toral Quiroga
>>
>> In fp64 we can produce code like this:
>>
>> mov(16) vgrf2<2>:UD, vgrf3<2>:UD
>>
>> That our simd lowering pass would typically split in i
t(exec_type_size);
> +
> + /* The hardware shifts exactly 8 channels per compressed half of the
> + * instruction in single-precision mode and exactly 4 in double-precision.
> + */
> + if (channels_per_grf != (exec_type_size == 8 ? 4 : 8))
> +
Vedran Miletić writes:
> On 06.06.2016 at 12:24, Vedran Miletić wrote:
>> On 06/06/2016 02:04 AM, Francisco Jerez wrote:
>>> Vedran Miletić writes:
>>>>
>>>> Aside from working just like NVIDIA and AMD proprietary stacks, no.
>>>>
>>
Alejandro Piñeiro writes:
> Without this commit, an image is considered valid if the level of the
> texture bound to the image is complete, something we can check as Mesa
> saves independently whether it is "base incomplete" or "mipmap incomplete".
>
> But, from the OpenGL 4.3 Core Specification, sectio
Vedran Miletić writes:
> On 07/13/2016 12:49 AM, Francisco Jerez wrote:
>> You can just replace the current implementation of tokenize(), it's not
>> used for anything else other than splitting compiler arguments AFAIK.
>>
>
> Done, patch incoming.
>
>>>
Alejandro Piñeiro writes:
> On 14/07/16 21:24, Francisco Jerez wrote:
>> Alejandro Piñeiro writes:
>>
>>> Without this commit, an image is considered valid if the level of the
>>> texture bound to the image is complete, something we can check as mesa
>
eaks the build because the member
> names of this class are replaced by the literal 1.
Reviewed-by: Francisco Jerez
> ---
> .../state_trackers/clover/llvm/invocation.cpp | 24 +++---
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff
te_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -39,6 +39,8 @@
> #include
>
> #include
> +#include
> +
Redundant whitespace, with that removed:
Reviewed-by: Francisco Jerez
> #include
This series implements the driver-independent part of the
EXT_shader_framebuffer_fetch extension that provides fully
programmable blending to GLES shaders. The GLSL IR and NIR
representation of the framebuffer fetch functionality should be
straightforward, but the way extension tracking works may
---
src/mesa/main/barrier.c | 51 +
src/mesa/main/barrier.h | 6 ++
src/mesa/main/shaderimage.c | 51 -
src/mesa/main/shaderimage.h | 6 --
4 files changed, 57 insertions(+), 57 deletions(-)
Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent
variant of KHR_blend_equation_advanced will use this driver hook to
request coherency between framebuffer reads and writes. This
intentionally doesn't hook up glBlendBarrierMESA to the dispatch layer
since the extension isn't exp
The EXT_shader_framebuffer_fetch extension defines alternative
language for GLES2 shaders where user-defined fragment outputs are not
allowed. Instead of using inout user-defined fragment outputs the
shader is expected to read from the gl_LastFragData built-in array.
In addition this allows using
According to the EXT_shader_framebuffer_fetch extension the inout
qualifier can be used on ESSL 3.0+ shaders to declare a special kind
of fragment output that gets implicitly initialized with the previous
framebuffer contents at the current fragment coordinates. In addition
we allow using the same
In preparation for collecting all pipeline barrier GL entry points
into a single source file.
---
src/mapi/glapi/gen/gl_genexec.py | 2 +-
src/mesa/Makefile.sources | 4 ++--
src/mesa/drivers/common/driverfuncs.c | 4 ++--
src/mesa/main/{texturebarrier.c
The GLSL IR representation of framebuffer fetch amounts to a single
bit in the ir_variable object applicable to fragment shader outputs.
The flag indicates that the variable will be implicitly initialized to
the previous contents of the render buffer at the same fragment
coordinates and sample inde
This allows drivers to expose EXT_shader_framebuffer_fetch in GLES2+
contexts if desired. Note that this adds boolean flags for two MESA
extensions, but only the EXT GLES-only extension is exposed for the
moment, see the cover letter of this series [1] for the rationale.
[1] https://lists.freedes
This can currently only give true as result since the only way you can
expose EXT_shader_framebuffer_fetch right now is by flipping the
MESA_shader_framebuffer_fetch bit, but that could potentially change
in the future, see [1] for an explanation.
[1] https://lists.freedesktop.org/archives/mesa-de
---
src/compiler/glsl/glsl_parser_extras.cpp | 3 +++
src/compiler/glsl/glsl_parser_extras.h | 13 +
2 files changed, 16 insertions(+)
diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
b/src/compiler/glsl/glsl_parser_extras.cpp
index 0077434..a2c1afe 100644
--- a/src/compiler/
---
src/mapi/glapi/gen/es_EXT.xml | 5 +
1 file changed, 5 insertions(+)
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index 6886dab..929e0e7 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -790,6 +790,11 @@
+
+
+
+
---
src/compiler/glsl/ast_to_hir.cpp | 13 +
1 file changed, 13 insertions(+)
diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index c050a3f..ac651a9 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -3948,6 +3948,1
This requires emitting a series of copies at the top of the program
from each output variable to the corresponding temporary. The initial
copy can be skipped for non-framebuffer fetch outputs whose initial
value is undefined, and the final copy needs to be skipped for
read-only outputs (i.e. gl_La
gl_LastFragData overlaps gl_FragData by definition.
---
src/compiler/glsl/linker.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 6d45a02..aeaeb9c 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cp
Since they cannot be written. This prevents adding fragment outputs
to the OutputsWritten set that are only read from via the
gl_LastFragData array but never written to.
---
src/compiler/glsl/ir_set_program_inouts.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/compile
This is the set of shader outputs whose initial value is provided to
the shader by some external means when the shader is executed, rather
than computed by the shader itself.
---
src/compiler/glsl/ir_set_program_inouts.cpp | 3 +++
src/mesa/main/mtypes.h | 1 +
2 files changed
Apparently this pass can only handle elimination of a single built-in
fragment output array, so the presence of gl_LastFragData (which it
wouldn't split correctly anyway) could prevent it from splitting the
actual gl_FragData array. Just match gl_FragData by name since it's
the only built-in it ca
The NIR representation of framebuffer fetch is the same as the GLSL
IR's until interface variables are lowered away, at which point it
will be translated to load output intrinsics. The GLSL-to-NIR pass
just needs to copy the bits over to the NIR program.
---
src/compiler/glsl/glsl_to_nir.cpp | 2
Alejandro Piñeiro writes:
> Hi,
>
> On 15/07/16 22:46, Francisco Jerez wrote:
>> Alejandro Piñeiro writes:
>>
>>> On 14/07/16 21:24, Francisco Jerez wrote:
>>>> Alejandro Piñeiro writes:
>>>>
>>>>> Without this commit, a ima
This is an implementation of non-coherent framebuffer fetch as
described here [1] working on most hardware generations supported
by the i965 driver (from Gen5 to Gen8). My plan was to send the
coherent framebuffer fetch implementation for SKL+ first since
it's actually simpler than the non-coheren
This is a no-op if the platform supports coherent framebuffer fetch.
If it doesn't, we just need to flush the render cache and invalidate
the texture cache in order for previous rendering to be visible to
framebuffer fetch.
---
src/mesa/drivers/dri/i965/brw_program.c | 20
1
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5bf9243..602306b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tre
Most likely we had only ever used this macro on bitfields of less than
31 bits -- That's going to change shortly.
---
src/mesa/drivers/dri/i965/brw_defines.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
b/src/mesa/drivers/dri/i965/brw
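Why a mask macro can misbehave only for wide bitfields is easiest to see with a small stand-in. The sketch below assumes the issue is a signed shift: `FIELD_MASK` is a hypothetical name and shape, not the macro from brw_defines.h.

```cpp
// Hypothetical bitfield mask macro (name and shape are assumptions).
// With a signed literal, a 31-bit field would evaluate (1 << 31), which
// overflows int. The unsigned literal keeps the arithmetic well defined
// for field widths of up to 31 bits.
#define FIELD_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1) << (low))

static_assert(FIELD_MASK(3, 0) == 0xfu, "4-bit field at bit 0");
static_assert(FIELD_MASK(15, 8) == 0xff00u, "8-bit field at bit 8");
static_assert(FIELD_MASK(30, 0) == 0x7fffffffu, "31-bit field at bit 0");
```

A full 32-bit field would still need separate handling in this form, since shifting a 32-bit value by 32 is undefined as well.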
Some of the following changes in this series are specific to the
non-coherent path, so I need some way to tell whether the coherent or
non-coherent path is in use. The flag defaults to the value of the
gl_extensions::MESA_shader_framebuffer_fetch enable so that it can be
overridden easily on hardw
This commit does three different things in a single pass in order to
keep the amount of churn low: Remove the for_gather boolean argument
which was unused, pass the isl_view argument by value rather than by
reference since I'll have to modify it from within the function, and
add a target argument t
This will be required for the next commit since the non-coherent path
makes use of the fragment coordinates implicitly, so they need to be
calculated.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
I'm about to change how fragment shader output locations are
represented, so the generic nir_intrinsic_store_output implementation
that assumes that outputs are just contiguous elements in the big
nir_outputs array won't work anymore. This somewhat simplified
implementation of nir_intrinsic_store_
The problem with the current approach is that driver output locations
are represented as a linear offset within the nir_outputs array, which
makes it rather difficult for the back-end to figure out what color
output and index some nir_intrinsic_load/store_output was meant for,
because the offset of
This will allow optimizing out the cache flush in some cases when
resolving wasn't necessary.
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 12
src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +-
2 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/src/mesa/drivers/
This gets the non-coherent framebuffer fetch path hooked up to the NIR
front-end.
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 19 +++
1 file changed, 19 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 281c704
This boolean flag was being used for two different things:
- To set the brw_wm_prog_data::dual_src_blend flag. Instead we can
just set it based on whether the dual_src_output register is valid,
which will be the case if the shader writes the secondary blending
color.
- To decide wheth
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 89
1 file changed, 89 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 2872b2d..f5f918d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b
This is not enabled on the original Gen4 part because it lacks surface
state tile offsets so it may not be possible to sample from arbitrary
non-zero layers of the framebuffer depending on the miptree layout (it
should be possible to work around this by allocating a scratch surface
and doing the sa
The result of a framebuffer fetch from a multisample FBO is inherently
per-sample, so the spec requires at least those sections of the shader
that depend on the framebuffer fetch result to be executed once per
sample.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
1 file changed, 2 insertions(+
This surface state control has been supported by all hardware
generations since G45.
---
src/mesa/drivers/dri/i965/brw_device_info.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c
b/src/mesa/drivers/dri/i965/brw_device_info.c
index 77bbe78..4d90a
This allows the caller to bind a miptree using a texture target other
than the one it was created with. The code should work even if the
memory layouts of the specified and original targets don't match, as
long as the caller only intends to access a single slice of the
miptree structure.
This
The logic to calculate the right layout and dimensionality for a given
GL texture target is going to be useful elsewhere, factor it out from
intel_miptree_get_isl_surf().
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 71 ++-
src/mesa/drivers/dri/i965/intel_mipmap_tree
This gets rid of the duplication of logic between nir_setup_outputs()
and get_frag_output() by allocating fragment output temporaries lazily
whenever get_frag_output() is called. This makes nir_setup_outputs()
a no-op for the fragment shader stage.
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7
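The lazy allocation idea described above can be illustrated in isolation. This is a generic sketch, not the driver code: `LazyOutputs` and its members are hypothetical names, and a plain counter stands in for a real register allocator.

```cpp
#include <cassert>
#include <map>

// Hypothetical sketch: temporaries are created the first time a given
// output location is requested, so no separate setup pass is needed.
class LazyOutputs {
public:
   // Return the temporary for a location, allocating it on first use.
   unsigned get(unsigned location)
   {
      auto it = temps.find(location);
      if (it == temps.end())
         it = temps.emplace(location, next_temp++).first;
      return it->second;
   }

   // Number of temporaries allocated so far.
   unsigned allocated() const { return static_cast<unsigned>(temps.size()); }

private:
   std::map<unsigned, unsigned> temps;  // location -> temporary index
   unsigned next_temp = 0;              // stand-in for a real allocator
};

// Usage: repeated lookups of one location share a single temporary.
inline void lazy_outputs_example()
{
   LazyOutputs outputs;
   const unsigned first = outputs.get(2);
   assert(outputs.get(2) == first);
   assert(outputs.allocated() == 1);
}
```

Outputs that are never requested never get a temporary, which is what makes an up-front setup step redundant in this pattern.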
Unfortunately due to the inconsistent meaning of some surface state
structure fields, we cannot re-use the same binding table entries for
sampling from and rendering into the same set of render buffers, so we
need to allocate a separate binding table block specifically for
render target reads if th
This iterates over the list of attached render buffers and binds
appropriate surface state structures to the binding table block
allocated for shader framebuffer read.
---
src/mesa/drivers/dri/i965/brw_state.h| 1 +
src/mesa/drivers/dri/i965/brw_state_upload.c | 4 ++
src/mesa/dr
This is required because the sampler unit used to fetch from the
framebuffer is unable to interpret non-color-compressed fast-cleared
single-sample texture data. Roughly the same limitation applies for
surfaces bound to texture or image units, but unlike texture sampling,
non-coherent framebuffer
Iago Toral Quiroga writes:
> From: Connor Abbott
>
> Less duplication, one less case to handle for doubles and support
> for sized NIR types.
>
> v2: Fix call to get_instance by swapping rows and columns params (Iago)
>
> Signed-off-by: Iago Toral Quiroga
Revi
Iago Toral Quiroga writes:
> From: Connor Abbott
>
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> b/src/mesa/drivers/dri/i965/brw_
Iago Toral Quiroga writes:
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_ve
> + unsigned dst_bit_size = nir_dest_bit_size(instr->dest.dest);
> + dst_type = (nir_alu_type) (dst_type | dst_bit_size);
Seems rather confusing to declare two temporaries for this and assign
one of them twice, when you could have written the nir_alu_type as a
straightforward closed-form express
Iago Toral Quiroga writes:
> From: Connor Abbott
>
> ---
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 5 -
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 6662a1e..1f8fa80
Alejandro Piñeiro writes:
> On 23/07/16 00:31, Francisco Jerez wrote:
>> Alejandro Piñeiro writes:
>>
>>> Hi,
>>>
>>> On 15/07/16 22:46, Francisco Jerez wrote:
>>>> Alejandro Piñeiro writes:
>>>>
>>>>
Anuj Phogat writes:
> On Fri, Jul 22, 2016 at 8:58 PM, Francisco Jerez
> wrote:
>> This boolean flag was being used for two different things:
>>
>> - To set the brw_wm_prog_data::dual_src_blend flag. Instead we can
>>   just set it based on whether the dua
Kenneth Graunke writes:
> On Wednesday, July 20, 2016 9:49:40 PM PDT Francisco Jerez wrote:
>> The EXT_shader_framebuffer_fetch extension defines alternative
>> language for GLES2 shaders where user-defined fragment outputs are not
>> allowed. Instead of using inout
Dieter Nützel writes:
> Could one of you commit this for me after review, please?
Reviewed-by and pushed, thanks.
>
> Thanks,
>Dieter
>
> On 28.07.2016 at 00:20, Dieter Nützel wrote:
>> Without this, GCC 4.8.x throws the error below:
>>
>> error: invalid initialization of non-const reference of typ
Ilia Mirkin writes:
> Another alternative would be to make these into IR nodes. This could
> be done by groups, e.g. adding a special "atomic" ir type with
> subtypes, or .. whatever else. Just pointing out that this fun with
> strings isn't totally necessary. Not sure why it was done that way in
Kenneth Graunke writes:
> On Wednesday, July 27, 2016 5:05:39 PM PDT Francisco Jerez wrote:
>> Kenneth Graunke writes:
>>
>> > On Wednesday, July 20, 2016 9:49:40 PM PDT Francisco Jerez wrote:
>> >> The EXT_shader_framebuffer_fetch extension defines alternat
Emil Velikov writes:
> On 6 June 2016 at 00:02, Vedran Miletić wrote:
>> On 06/04/2016 04:18 AM, Francisco Jerez wrote:
>>>
>>> Serge Martin writes:
>>>
>>>> From: Vedran Miletić
>>>>
>>>> Make sure that a struct argumen
---
src/mesa/drivers/dri/i965/intel_extensions.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 12bf454..33114db 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b
We weren't checking the fs_inst::target field when comparing whether
two instructions are equal. For FB writes it doesn't matter because
they aren't CSE-able anyway, but this would have become a problem with
FB reads which are expression-like instructions.
---
src/mesa/drivers/dri/i965/brw_fs_cse
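The effect of leaving one field out of an instruction comparison can be shown with a toy model. The struct below is a hypothetical, heavily simplified stand-in for an instruction record, so the field names other than `target` are assumptions made for illustration.

```cpp
// Hypothetical, simplified instruction record.
struct Inst {
   unsigned opcode;
   unsigned src0;
   unsigned src1;
   unsigned target;   // e.g. which render target the access refers to
};

// Two instructions are interchangeable for CSE only if every field that
// affects the result matches, including the target.
constexpr bool instructions_match(const Inst &a, const Inst &b)
{
   return a.opcode == b.opcode &&
          a.src0 == b.src0 &&
          a.src1 == b.src1 &&
          a.target == b.target;
}

// Two reads that differ only in their target.
constexpr Inst read_rt0 = {1, 10, 11, 0};
constexpr Inst read_rt1 = {1, 10, 11, 1};

static_assert(instructions_match(read_rt0, read_rt0),
              "identical reads match");
static_assert(!instructions_match(read_rt0, read_rt1),
              "reads from different targets must not be merged");
```

Without the last comparison the two reads would compare equal, and a CSE pass could replace one with the result of the other.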
---
src/mesa/drivers/dri/i965/brw_defines.h | 4
src/mesa/drivers/dri/i965/brw_eu.h | 8
src/mesa/drivers/dri/i965/brw_eu_emit.c | 28
3 files changed, 40 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
b/src/mesa/drivers/dri/i
---
src/mesa/drivers/dri/i965/brw_defines.h| 3 +++
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 ++
src/mesa/drivers/dri/i965/brw_fs.h | 2 ++
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 20
src/mesa/drivers/dri/i965/brw_shader.cpp | 2
---
src/mesa/drivers/dri/i965/brw_disasm.c | 28 +---
1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
b/src/mesa/drivers/dri/i965/brw_disasm.c
index d74d5d5..cca4c8b 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+
The reason why it was safe for the scheduler to ignore the side
effects of framebuffer write instructions was that its side effects
couldn't have had any influence on any other instruction in the
program, because we weren't doing framebuffer reads, and framebuffer
writes were always non-overlapping
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 22 --
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index e3215da..85e111d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
---
src/mesa/drivers/dri/i965/brw_defines.h | 1 +
src/mesa/drivers/dri/i965/brw_fs.cpp | 24
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 1 +
src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++
4 files changed, 28 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/
This series gets coherent render target reads working with the i965
driver and exposes the EXT_shader_framebuffer_fetch extension on Gen9+
platforms. It's dependent on the series I sent last week to make the
driver-independent changes to enable framebuffer fetch [1], and the
series to enable non-c
brw_set_dp_read_message() was setting the data cache as send message
SFID on Gen7+ hardware, ignoring the target cache specified by the
caller. Some of the callers were passing a bogus target cache value
as argument relying on brw_set_dp_read_message not to take it into
account. Fix them too.
---
Kenneth Graunke writes:
> On Wednesday, July 20, 2016 9:49:42 PM PDT Francisco Jerez wrote:
>> ---
>> src/compiler/glsl/ast_to_hir.cpp | 13 +
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/src/compiler/glsl/ast_to_hir.cpp
>> b/sr
Iago Toral Quiroga writes:
> From: "Juan A. Suarez Romero"
>
> Our current data flow analysis does not take into account that channels
> on 64-bit operands are 64-bit. This is a problem when the same register
> is accessed using both 64-bit and 32-bit channels. This is very common
> in operation
Francisco Jerez writes:
> Kenneth Graunke writes:
>
>> On Wednesday, July 27, 2016 5:05:39 PM PDT Francisco Jerez wrote:
>>> Kenneth Graunke writes:
>>>
>>> > On Wednesday, July 20, 2016 9:49:40 PM PDT Francisco Jerez wrote:
>>> >
Kenneth Graunke writes:
> On Friday, July 22, 2016 8:59:03 PM PDT Francisco Jerez wrote:
>> The problem with the current approach is that driver output locations
>> are represented as a linear offset within the nir_outputs array, which
>> makes it rather difficult for the b
Reviewed-by: Francisco Jerez
Iago Toral Quiroga writes:
> ---
> src/mesa/drivers/dri/i965/brw_vec4.h | 5 +
> 1 file changed, 5 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 3043147..afcf31e 10
Iago Toral Quiroga writes:
> Add asserts so we remember to address this when we enable 64-bit
> integer support, as suggested by Connor and Jason.
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 70 ++
>
Reviewed-by: Francisco Jerez
Iago Toral Quiroga writes:
> ---
> src/mesa/drivers/dri/i965/brw_disasm.c | 5 -
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
> b/src/mesa/drivers/dri/i965/brw_disasm.c
> i
Iago Toral Quiroga writes:
> From: Connor Abbott
>
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_reg.h | 6 ++
> 1 file changed, 6 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_reg.h
> b/src/mesa/drivers/dri/i965/brw_reg.h
Iago Toral Quiroga writes:
> From: Connor Abbott
>
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> b/src/mesa/drivers/dri/i965/brw_
sa_values[instr->def.index] = dst_reg(VGRF, alloc.allocate(1));
> + nir_ssa_values[instr->def.index] =
> + dst_reg(VGRF, alloc.allocate(instr->def.bit_size / 32));
I think you want to use DIV_ROUND_UP here instead, with that fixed:
R
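The difference between plain division and rounding up only shows for sizes that are not a multiple of the divisor. The sketch below spells out the usual round-up idiom as a local function so that it is self-contained. Mesa has a DIV_ROUND_UP macro for this; the sub-32-bit size is an assumed example, not a case taken from the patch.

```cpp
// Local stand-in for the usual round-up division idiom.
constexpr unsigned div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

// For multiples of 32 both forms agree.
static_assert(64 / 32 == 2 && div_round_up(64, 32) == 2, "64-bit agrees");
static_assert(32 / 32 == 1 && div_round_up(32, 32) == 1, "32-bit agrees");
// A size below 32 bits truncates to zero units with plain division.
static_assert(16 / 32 == 0 && div_round_up(16, 32) == 1, "16-bit differs");
```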
Iago Toral Quiroga writes:
> ---
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 18 ++
> 1 file changed, 18 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 1525a3d..4014020 100644
> --- a/src/mesa/d
Iago Toral Quiroga writes:
> ---
> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index cf35f2e..fde7b60 100644
> --- a/src/me
Iago Toral Quiroga writes:
> Gen7 hardware does not support double immediates so these need
> to be moved in 32-bit chunks to a regular vgrf instead. Instead
> of doing this every time we need to create a DF immediate,
> create a helper function that does the right thing depending
> on the hardwa
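Moving a double "in 32-bit chunks" amounts to splitting its 64-bit pattern into a low and a high half. The sketch below shows only that splitting step on the CPU side. `split_double` and `join_double` are hypothetical names, not the helper added by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The two 32-bit halves of a double's bit pattern.
struct DoubleHalves {
   uint32_t lo;
   uint32_t hi;
};

// Hypothetical helper: split a double so that each half could be loaded
// with a separate 32-bit move.
inline DoubleHalves split_double(double value)
{
   uint64_t bits;
   std::memcpy(&bits, &value, sizeof(bits));   // well-defined type pun
   return { static_cast<uint32_t>(bits & 0xffffffffu),
            static_cast<uint32_t>(bits >> 32) };
}

// Reassemble the halves, to check that no information is lost.
inline double join_double(DoubleHalves halves)
{
   const uint64_t bits = (static_cast<uint64_t>(halves.hi) << 32) | halves.lo;
   double value;
   std::memcpy(&value, &bits, sizeof(value));
   return value;
}

// Usage: a value survives the split and join round trip unchanged.
inline void split_double_example()
{
   assert(join_double(split_double(1.5)) == 1.5);
}
```

`std::memcpy` is used instead of a pointer cast so that the reinterpretation stays within defined behavior.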
Iago Toral Quiroga writes:
> Generally, instructions in Align16 mode only ever write to a single
> register and don't need any form of SIMD splitting, that's why we
> have never had a SIMD splitting pass in the vec4 backend. However,
> double-precision instructions typically write 2 registers an