Abdiel Janulgue writes:
> Use the gather table generated from the uniform uploads and
> ir_binop_ubo_load to gather and pack the constants to the gather pool.
>
> Note that the 3DSTATE_CONSTANT_* packet now refers to the gather
> pool generated by the resource streamer instead of the constant buf
buffer index, but they don't have
> to be the same value; they just happened to end up the same when binding is 0.
>
> Cc: Francisco Jerez
> Cc: Ilia Mirkin
> Cc: Alejandro Piñeiro
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
> ---
> src/gl
ze_t body_pos = source.find("COMP\n");
> + if (body_pos == std::string::npos) {
> + r_log = "invalid source";
> + throw compile_error();
> + }
> +
> const char *body = &source[body_pos];
> module m;
Reviewed-by: Francisco Jerez
>
Francisco Jerez writes:
> Abdiel Janulgue writes:
>
>> Use the gather table generated from the uniform uploads and
>> ir_binop_ubo_load to gather and pack the constants to the gather pool.
>>
>> Note that the 3DSTATE_CONSTANT_* packet now refers to the gather
>
Laurent Carlier writes:
> https://bugs.freedesktop.org/show_bug.cgi?id=92705
>
> Signed-off-by: Laurent Carlier
> ---
> src/gallium/state_trackers/clover/llvm/invocation.cpp | 5 +
> 1 file changed, 5 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> b/
Laurent Carlier writes:
> https://bugs.freedesktop.org/show_bug.cgi?id=92705
>
> v2: use Linker::Flags::None instead of 0
>
> Signed-off-by: Laurent Carlier
> ---
> src/gallium/state_trackers/clover/llvm/invocation.cpp | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git
Laurent Carlier writes:
> https://bugs.freedesktop.org/show_bug.cgi?id=92705
>
> v2.1: use Linker::Flags::None instead of 0 and emplace_back()
>
Thanks,
Reviewed-by: Francisco Jerez
> Signed-off-by: Laurent Carlier
> ---
> src/gallium/state_trackers/clover/ll
Francisco Jerez writes:
> Chris Wilson writes:
>
>> On Sat, Oct 03, 2015 at 05:57:05PM +0300, Francisco Jerez wrote:
>>> Jordan Justen writes:
>>>
>>> > From: Francisco Jerez
>>> >
>>> > Fixes
>>> > arb_shader
Timothy Arceri writes:
> Cc: Francisco Jerez
> ---
> src/glsl/link_uniforms.cpp | 77
> +-
> 1 file changed, 49 insertions(+), 28 deletions(-)
>
> diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
> index
Timothy Arceri writes:
> Cc: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 44
> --
> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 2 ++
> 2 files changed, 30 insertions(+), 16 deletions(-)
>
> diff --git a/s
surf_index = fs_reg(stage_prog_data->binding_table.ubo_start +
> - const_index->u[0]);
> +unsigned index = stage_prog_data->binding_table.ubo_start +
> + const_index->u[0];
const -- With these fixed:
Reviewed-by: Franc
Francisco Jerez writes:
> Iago Toral Quiroga writes:
>
>> Right now the generator marks direct surfaces as used but leaves marking of
>> indirect surfaces to the caller. Just make the callers handle marking in both
>> cases for consistency.
>> ---
>>
rog_data, surf_index);
> -
> } else {
>
>struct brw_reg addr = vec1(retype(brw_address_reg(0),
> BRW_REGISTER_TYPE_UD));
> @@ -1269,11 +1264,6 @@
> fs_generator::generate_uniform_pull_constant_load_gen7(fs_inst *inst,
>0);
>
>
set;
> - src_reg index =
> src_reg(prog_data->base.binding_table.pull_constants_start);
> + unsigned index = prog_data->base.binding_table.pull_constants_start;
> + src_reg surf_index = src_reg(index);
Again it doesn't seem particularly useful to
(prog_data->base.binding_table.ssbo_start +
> - ssbo_index);
> + unsigned index = prog_data->base.binding_table.ssbo_start + ssbo_index;
> + src_reg surf_index = src_reg(index);
"const unsigned index" and get rid of the useless
> bld.emit(inst);
> +
> + brw_mark_surface_used(prog_data, index);
If you make the same change as in the last patch:
Reviewed-by: Francisco Jerez
>break;
> }
>
> --
> 1.9.1
>
> __
Iago Toral Quiroga writes:
> Do it in the visitor, like we do for other opcodes.
Hm... I'm not 100% convinced of this and the texturing changes (patches
3 and 5). It definitely makes sense to do this explicitly in the
visitor for the pull constant and dataport surface opcodes, because they
tak
Iago Toral Quiroga writes:
> Right now some opcodes that only use constant surface indexing mark them as
> used in the generator while others do it in the visitor. When the opcode can
> handle both direct and indirect surface indexing then some opcodes handle
> only the constant part in the gener
Jordan Justen writes:
> On 2015-10-30 09:28:10, Matt Turner wrote:
>> On Fri, Oct 30, 2015 at 4:11 AM, Iago Toral Quiroga
>> wrote:
>> > Right now some opcodes that only use constant surface indexing mark them as
>> > used in the generator while others do it in the visitor. When the opcode
>>
image.reladdr, and replace while loop with a for loop. All suggested
> by Francisco Jerez.
>
> Cc: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 30
> ++
> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 2 ++
> 2 fil
Matt Turner writes:
> This reverts commit bbf8239f92ecd79431dfa41402e1c85318e7267f.
>
> I didn't like that commit to begin with -- computing things at compile
> time is fine -- but for purposes of verifying that the resulting values
> are correct, looking up 0x00 and 0x30 in a table is a lot bett
Iago Toral writes:
> On Fri, 2015-10-30 at 16:19 +0200, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> > Right now some opcodes that only use constant surface indexing mark them as
>> > used in the generator while others do it in the visitor. When
Matt Turner writes:
> Generated by
>
>    sed -i -e 's/\.bits\././g' *.c *.h *.cpp
>    sed -i -e 's/dw1\.//g' *.c *.h *.cpp
>
> and then reverting changes to comments in gen7_blorp.cpp and
> brw_fs_generator.cpp.
>
> There wasn't any utility offered by forcing the programmer to list these
> to a
Iago Toral writes:
> On Tue, 2015-11-03 at 15:28 +0200, Francisco Jerez wrote:
>> Iago Toral writes:
>>
>> > On Fri, 2015-10-30 at 16:19 +0200, Francisco Jerez wrote:
>> >> Iago Toral Quiroga writes:
>> >>
>> >> > Righ
Connor Abbott writes:
> Hi all,
>
> While working on FP64 for i965, there's an issue that I thought of
> with the vec4 backend that I'm not sure how to resolve. From what I
> understand, the execmask works the same way in Align16 mode as Align1
> mode, except that you only use the first 8 channel
Francisco Jerez writes:
> Connor Abbott writes:
>
>> Hi all,
>>
>> While working on FP64 for i965, there's an issue that I thought of
>> with the vec4 backend that I'm not sure how to resolve. From what I
>> understand, the execmask works the same
Timothy Arceri writes:
> V3: clamp array index to the correct size (the size of the current array
> rather than the inner array) Francisco Jerez.
>
> V2: avoid useless zero-initialization and addition for the first AoA level,
> avoid redundant temporary, make use of type_size_s
Matt Turner writes:
> On Tue, Nov 3, 2015 at 5:16 AM, Francisco Jerez wrote:
>> Matt Turner writes:
>>
>>> This reverts commit bbf8239f92ecd79431dfa41402e1c85318e7267f.
>>>
>>> I didn't like that commit to begin with -- computing things at
Jordan Justen writes:
> When these functions are called in GLSL code, we create an intrinsic
> function call:
>
> * groupMemoryBarrier => __intrinsic_group_memory_barrier
> * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter
> * memoryBarrierBuffer => __intrinsic_memory_b
Jordan Justen writes:
> When these functions are called in glsl-ir, we create a corresponding
> nir intrinsic function call.
>
> Signed-off-by: Jordan Justen
Reviewed-by: Francisco Jerez
> ---
> src/glsl/nir/glsl_to_nir.cpp | 15 +++
> src/glsl/nir/
Iago Toral writes:
> On Tue, 2015-11-03 at 09:19 -0800, Mark Janes wrote:
>> Francisco Jerez writes:
>>
>> > Iago Toral writes:
>> >
>> >> On Tue, 2015-11-03 at 15:28 +0200, Francisco Jerez wrote:
>> >>> Iago Toral writes:
>>
pport for SVM is implemented.
In short, I think it's okay to leave this as a no-op but please add an
XXX comment explaining why it's not necessary to do anything so we
remember to revisit it when any of these three conditions changes.
>
> Signed-off-by: Jordan Justen
> Cc
Jordan Justen writes:
> When these functions are called in GLSL code, we create an intrinsic
> function call:
>
> * groupMemoryBarrier => __intrinsic_group_memory_barrier
> * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter
> * memoryBarrierBuffer => __intrinsic_memory_b
Matt Turner writes:
> On Tue, Nov 3, 2015 at 5:48 AM, Francisco Jerez wrote:
>> Matt Turner writes:
>>
>>> Generated by
>>>
>>>sed -i -e 's/\.bits\././g' *.c *.h *.cpp
>>>sed -i -e 's/dw1\.//g' *.c *.h *.c
Jordan Justen writes:
> On 2015-11-05 06:07:02, Francisco Jerez wrote:
>> Jordan Justen writes:
>>
>> > When these functions are called in GLSL code, we create an intrinsic
>> > function call:
>> >
>> > * gro
ead of add_memory_barrier_function, add an intrinsic_name
>parameter to _memory_barrier (curro)
>
> Signed-off-by: Jordan Justen
> Cc: Francisco Jerez
Reviewed-by: Francisco Jerez
> ---
> src/glsl/builtin_functions.cpp | 55
> +-
nir intrinsics as no-ops:
> * nir_intrinsic_group_memory_barrier
> * nir_intrinsic_memory_barrier_shared
>
> v3:
> * Add comment for no-op cases (curro)
>
> Signed-off-by: Jordan Justen
> Cc: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++
Jan Vesely writes:
> On Sun, 2015-01-25 at 01:02 +0100, Niels Ole Salscheider wrote:
>> On Saturday 24 January 2015, 18:24:16, Jan Vesely wrote:
>> > On Sat, 2015-01-24 at 22:49 +0100, Niels Ole Salscheider wrote:
>> > > Since 8e7df519bd8556591794b2de08a833a67e34d526, we initialise all targets
>>
> > const&) (program.cpp:63)
> ==1936==    by 0x5B20152: clBuildProgram (program.cpp:182)
> ==1936==    by 0x400F41: main (hello_world.c:109)
>
> Signed-off-by: Michel Dänzer
Looks good,
Reviewed-by: Francisco Jerez
> ---
>
> v2: Just use target instead of target.begin
EdB writes:
> ---
> src/gallium/state_trackers/clover/llvm/invocation.cpp | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 10dbe08..510e195 10064
Kenneth Graunke writes:
> On Sunday, January 18, 2015 01:04:02 AM Francisco Jerez wrote:
>> This is the first part of a series meant to improve our usage of the L3
>> cache.
>> Currently it's far from ideal since the following objects aren't taking any
>>
This matches what _mesa_BindImageTextures() does. The derived image format
(gl_texture_image::TexFormat) isn't necessarily equivalent to the internal
format of the texture image. If a forbidden internal format has been
specified we need to mark the image unit as invalid as required by the spec,
r
---
src/mesa/main/uniform_query.cpp | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index 32870d0..82e5e38 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -362,7 +362,8 @@
This function will be useful for back-ends to translate an image internal
format as specified in GLSL code into a mesa format.
---
src/mesa/main/shaderimage.c | 10 +-
src/mesa/main/shaderimage.h | 7 +++
2 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/src/mesa/main/sh
gl_texture_object::_MaxLevel doesn't have any meaningful value until
_mesa_test_texobj_completeness() has been run. Fixes the "level"
ARB_shader_image_load_store piglit test.
---
src/mesa/main/shaderimage.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/mesa/main/s
---
src/mesa/main/shaderimage.c | 73 ++---
1 file changed, 42 insertions(+), 31 deletions(-)
diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
index f812073..005698c 100644
--- a/src/mesa/main/shaderimage.c
+++ b/src/mesa/main/shaderim
---
src/mesa/main/shaderimage.h | 8
1 file changed, 8 insertions(+)
diff --git a/src/mesa/main/shaderimage.h b/src/mesa/main/shaderimage.h
index 4aa859c..1c7d1e0 100644
--- a/src/mesa/main/shaderimage.h
+++ b/src/mesa/main/shaderimage.h
@@ -30,6 +30,10 @@
#include "glheader.h"
#includ
This is the required initial image unit state according to "Table 23.45. Image
State (state per image unit)" of the OpenGL 4.3 specification.
---
src/mesa/main/context.c | 2 ++
src/mesa/main/shaderimage.c | 13 +
src/mesa/main/shaderimage.h | 6 ++
3 files changed, 21 insert
There's no indication in the spec that the image unit state other than the
bound texture object shouldn't be updated when glBindImageTexture() is called
passing the zero texture as argument. It's very unlikely that any application
would ever have relied on this, but it's easy to get right, and it
Image memory qualifiers (coherent, volatile, restrict, readonly and writeonly)
follow slightly different rules from storage qualifiers, e.g. the uniqueness
rule doesn't apply. Make them a separate non-terminal.
---
src/glsl/glsl_parser.yy | 17 -
1 file changed, 16 insertions(+),
---
src/glsl/ast_to_hir.cpp | 12
1 file changed, 12 insertions(+)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 1ba29f7..783384e 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1596,6 +1596,18 @@ ast_expression::do_hir(exec_list *instruc
And rename _mesa_glsl_parse_state::early_fragment_tests to
fs_early_fragment_tests for consistency with other FS-specific flags in the
same struct.
---
src/glsl/ast_type.cpp | 2 +-
src/glsl/glsl_parser_extras.cpp | 4 +++-
src/glsl/glsl_parser_extras.h | 2 +-
src/glsl/linker.cpp
---
src/glsl/ast_to_hir.cpp | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 783384e..1cfba39 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2399,6 +2399,14 @@ apply_image_qualifier_to_va
The spec doesn't define any opaque type constructors.
---
src/glsl/ast_function.cpp | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index cbff9d8..151b082 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_funct
---
src/glsl/ast_to_hir.cpp | 14 ++
src/glsl/ast_type.cpp | 10 +-
src/glsl/glsl_parser.yy | 15 +++
3 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 1cfba39..6db9365 100644
--- a/src/glsl/ast
Cubemap array images are unlike cubemap array samplers in that they don't need
an additional coordinate to index individual cubemaps in the array; instead
they behave like a 2D array of 6n layers, with n the number of cubemaps in the
array. Take this exception into account.
---
src/glsl/glsl_type
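The "2D array of 6n layers" view of a cubemap array image described above can be sketched as follows. This is only an illustration, not code from the patch: the helper names are hypothetical, and the face ordering inside each cubemap (layer = 6 * cube + face) is an assumption taken from the usual GL layer-face convention.

```cpp
#include <cassert>

// Number of faces in one cubemap.
const unsigned kFacesPerCube = 6;

// Hypothetical helper: number of 2D layers exposed by a cubemap array
// image that holds num_cubes cubemaps (the "6n layers" from the text).
inline unsigned image_layer_count(unsigned num_cubes)
{
   return kFacesPerCube * num_cubes;
}

// Hypothetical helper: layer addressed by a given (cube, face) pair when
// the image is treated as a plain 2D array.  The ordering is an assumption.
inline unsigned image_layer(unsigned cube, unsigned face)
{
   assert(face < kFacesPerCube);
   return kFacesPerCube * cube + face;
}
```

With these definitions an array of 3 cubemaps exposes 18 layers, and face 1 of cubemap 2 is layer 13, so the image coordinate needs only three components (x, y, layer).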
Ian Romanick writes:
> On 01/31/2015 09:54 PM, Francisco Jerez wrote:
>> ---
>> src/glsl/ast_to_hir.cpp | 12
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 1ba29f7..783384e
Ian Romanick writes:
> On 01/31/2015 09:54 PM, Francisco Jerez wrote:
>> Cubemap array images are unlike cubemap array samplers in that they don't
>> need
>> an additional coordinate to index individual cubemaps in the array, instead
>> they behave like a 2
Ian Romanick writes:
> On 02/01/2015 01:15 PM, Francisco Jerez wrote:
>> Ian Romanick writes:
>>
>>> On 01/31/2015 09:54 PM, Francisco Jerez wrote:
>>>> ---
>>>> src/glsl/ast_to_hir.cpp | 12
>>>> 1 file changed, 12 in
Eric Anholt writes:
> Francisco Jerez writes:
>
>> This reverts commit 3fad0868f023f1d726e230968a4df3327de38823.
>>
>> This test doesn't make any sense to me, it begins quoting the GLSL
>> 1.30 spec on the interaction of the discard keyword with control flo
Matt Turner writes:
> Prevents piglit regressions from the next patch.
> ---
> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 37
> +-
> 1 file changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> b/src/mesa/driver
Francisco Jerez writes:
> This is the first part of a series meant to improve our usage of the L3 cache.
> Currently it's far from ideal since the following objects aren't taking any
> advantage of it:
> - Pull constants (i.e. UBOs and demoted uniforms)
> - Buffer tex
width == 16);
>> + type = ST_FS16;
>> + written_type = ST_FS16_WRITTEN;
>> + reset_type = ST_FS16_RESET;
>> + }
>> + break;
>> + default:
>> + unreachable("fs_visitor::emit_shader_time_end missing code");
>>
One should be able to manipulate i965 IR without pulling the whole
FS/VEC4 visitor classes -- Optimization passes and other
transformations would ideally be visitor-agnostic. Among other issues
this avoids a circular dependency between the header file where such
visitor-agnostic code will be defin
It will also be useful in the VEC4 back-end.
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 1 -
src/mesa/drivers/dri/i965/brw_shader.h | 1 +
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 1 +
3 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965
So regs_written gets initialized with a sensible value.
---
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +--
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index babddee.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 +++-
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++
src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 1 +
3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/d
---
src/mesa/drivers/dri/i965/brw_ir_vec4.h | 2 ++
src/mesa/drivers/dri/i965/brw_vec4.cpp | 16
2 files changed, 18 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index ae024b3..f11a2d2 100644
--- a/src/mesa/driver
The only reason why you need a vec4_visitor to construct a
vec4_instruction is to initialize vec4_instruction::ir and
::annotation. Instead set them from vec4_visitor::emit() just like
fs_visitor does.
---
src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 +-
src/mesa/drivers/dri/i965/brw_vec4_
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 8
1 file changed, 8 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8da1f47..e2ebf7e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -
Due to the way it's implemented in hardware, the F16TO32/F32TO16
instructions require the source/destination register to be of some
16-bit type in Align1 mode, while they require it to be some 32-bit
type in Align16 mode (and as an undocumented feature the high 16 bits
of the destination register a
In preparation for some send from GRF instructions that will require
larger payloads.
---
src/mesa/drivers/dri/i965/brw_fs.h | 3 ---
src/mesa/drivers/dri/i965/brw_shader.h | 3 +++
src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp | 10 ++
3 files changed
The fs_visitor argument of fs_inst::regs_read() wasn't used at all.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp| 4 ++--
src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 2 +-
src/mesa/drivers/dri/i965/b
Right now virtual GRF book-keeping and allocation is performed in each
visitor class separately (among a hundred other things),
leading to duplicated logic in each visitor and preventing layering as
it forces any code that manipulates i965 IR and needs to allocate
virtual registers to depen
---
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 2cd185b..babddee 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mes
---
src/mesa/drivers/dri/i965/brw_ir_vec4.h | 1 +
src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 15 ++-
src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 +
3 files changed, 28 insertions(+), 5 deletions(-)
diff --git a/src/mesa
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c3f68e6..aaa4873 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i9
---
src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
index 80735c3..46f0bfd 100644
--- a/src/mesa/drivers/dri/i965/br
---
.../drivers/dri/i965/brw_schedule_instructions.cpp | 32 --
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 2b22b2c..24075bd 100644
--- a
It's expanded to several instructions.
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ed35c4b..85b9162f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++
Negation of UD/UW sources behaves the same as for D/W sources, taking
the two's complement of the source, except for bitwise logical
operations on Gen8 and up which take the one's complement. Fixes
crash in a GLSL shader with subtraction of two unsigned values.
---
.../drivers/dri/i965/brw_fs_cop
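A minimal sketch of the two negation behaviours described above, modelled on plain 32-bit unsigned integers. The function names are hypothetical and only mirror the wording of the commit message; they are not the driver's API.

```cpp
#include <cassert>
#include <cstdint>

// Negate modifier on a UD source of an arithmetic instruction:
// two's complement, same as for a D source.
constexpr uint32_t negate_arith(uint32_t x)
{
   return 0u - x;
}

// Negate modifier on a source of a bitwise logical instruction on
// Gen8 and up: one's complement.
constexpr uint32_t negate_logic_gen8(uint32_t x)
{
   return ~x;
}

// Unsigned subtraction expressed as "a + (-b)", which is how a
// subtraction of two unsigned values ends up needing the negate modifier.
constexpr uint32_t sub_via_negate(uint32_t a, uint32_t b)
{
   return a + negate_arith(b);
}

static_assert(negate_arith(1u) == 0xFFFFFFFFu, "two's complement of 1");
static_assert(negate_logic_gen8(1u) == 0xFFFFFFFEu, "one's complement of 1");
```

Because unsigned arithmetic wraps modulo 2^32, sub_via_negate(a, b) matches a - b for all inputs, which is why treating a negated UD source like a negated D source is sound for arithmetic but not for bitwise logical operations.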
If the source type differs from the original type of the constant we
need to bit-cast it before propagating, otherwise the original type
information will be lost. If the constant was a vector float there
isn't much we can do, because the result of bit-casting the component
values of a vector float
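To illustrate why a constant has to be bit-cast rather than value-converted when the source type differs from the constant's original type, here is a small sketch for a scalar float constant read through an unsigned 32-bit source. The helper names are hypothetical and the code only models the idea; it is not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a float constant as an unsigned dword.  This is
// what a UD-typed source would actually see in the register.
inline uint32_t bitcast_f_to_ud(float f)
{
   static_assert(sizeof(uint32_t) == sizeof(float), "expects 32-bit float");
   uint32_t u;
   std::memcpy(&u, &f, sizeof(u));
   return u;
}

// Value conversion, shown only for contrast: propagating this would change
// what the instruction computes.
inline uint32_t convert_f_to_ud(float f)
{
   return static_cast<uint32_t>(f);
}
```

For the constant 1.0f the bit-cast yields 0x3f800000 while the value conversion yields 1, so propagating the constant without the bit-cast would silently change the program (assuming the usual IEEE 754 single-precision representation).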
Some instruction bits don't have a mapping defined to any compacted
instruction field. If they're ever set and we end up compacting the
instruction they will be forced to zero. Avoid using compaction in such
cases.
---
src/mesa/drivers/dri/i965/brw_eu_compact.c | 48 +
Scalar registers are required to have zero stride.  Fix the
regs_written calculation not to assume that the instruction writes
zero registers in that case.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs
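The zero-stride problem described above can be sketched like this. The formula and the names are assumptions made for illustration (a 32-byte register and a size computed from width, stride and element size); the actual patch body is not shown in this excerpt.

```cpp
#include <cassert>

// Size of one general register in bytes (assumed to be 32 here).
const unsigned kRegBytes = 32;

// Naive calculation: a scalar destination has stride 0, so the product is
// zero and the instruction appears to write no registers at all.
inline unsigned regs_written_naive(unsigned width, unsigned stride,
                                   unsigned type_bytes)
{
   return (width * stride * type_bytes + kRegBytes - 1) / kRegBytes;
}

// Sketch of a fix: treat a zero-stride destination as writing at least
// one element before rounding up to whole registers.
inline unsigned regs_written_fixed(unsigned width, unsigned stride,
                                   unsigned type_bytes)
{
   assert(type_bytes > 0);
   const unsigned elems = width * stride > 0 ? width * stride : 1;
   return (elems * type_bytes + kRegBytes - 1) / kRegBytes;
}
```

For a scalar dword destination (width 1, stride 0, 4 bytes) the naive version returns 0 and the fixed one returns 1, while ordinary 8-wide and 16-wide dword destinations still come out as 1 and 2 registers.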
Using 'ralloc*(this, ...)' is wrong if the object has automatic
storage or was allocated through any other means. Use normal dynamic
memory instead.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 68 +--
src/mesa/drivers/dri/i965/brw_ir_fs.h | 8 +++--
2 files change
The second one was inside an extern "C" block; luckily it was being
discarded by the preprocessor.
---
src/mesa/drivers/dri/i965/brw_fs.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
b/src/mesa/drivers/dri/i965/brw_fs.h
index e789d25..ccd3da7 100644
--- a/
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 163aa41..8da1f47 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.
Fixes metadata guess when instructions in the program specify a
destination register with non-zero reg_offset and when the payload of
a LOAD_PAYLOAD spans several registers.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/mesa/dr
MRFs cannot be read from anyway so they cannot possibly be a valid
source of LOAD_PAYLOAD.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 39
1 file changed, 13 insertions(+), 26 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dr
We cannot zero out the destination register if it overlaps with the
source. Use an Align1 instruction instead to zero out the high 16
bits after the conversion to half float.
---
src/mesa/drivers/dri/i965/brw_eu_emit.c | 36 -
1 file changed, 27 insertions(+), 9 de
---
src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp
index 9604e60..5df0d31 100644
--- a/src/mesa
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27 +++---
src/mesa/drivers/dri/i965/brw_reg.h | 22 +
2 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
b/src/mesa/drivers/dri/i96
Fixes rewrite by the register coalesce pass of references to
individual halves of 16-wide coalesced registers.
---
src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.c
---
src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index aaa4873..a4fd136 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 -
src/mesa/drivers/dri/i965/brw_shader.h | 5 +
src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +--
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 1 +
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
It doesn't really improve locality of texture fetches; quite the
opposite, it's a waste of memory bandwidth and space due to tile
alignment.
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4
1 file changed, 4 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
b/src
Shaders with image uniforms may have side effects. Make sure that
fragment shader threads are dispatched if the shader has any image
uniforms.
---
src/mesa/drivers/dri/i965/gen7_wm_state.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_s
---
src/mesa/drivers/dri/i965/brw_program.c | 40 +
src/mesa/drivers/dri/i965/intel_reg.h | 1 +
2 files changed, 41 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_program.c
b/src/mesa/drivers/dri/i965/brw_program.c
index d9a3f05..793d963 100644
--- a
Reviewed-by: Paul Berry
---
src/mesa/drivers/dri/i965/brw_context.h | 5 +
src/mesa/drivers/dri/i965/brw_shader.cpp | 7 +++
2 files changed, 12 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h
b/src/mesa/drivers/dri/i965/brw_context.h
index bebb0be..e28c65d 100644
--
This will be used to pass image meta-data to the shader when we cannot
use typed surface reads and writes. All entries except surface_idx
and size are otherwise unused and will get eliminated by the uniform
packing pass. size will be used for bounds checking with some image
formats and will be us