"Serge Martin (EdB)" writes:
> On Wednesday 19 August 2015 11:56:08 Zoltan Gilian wrote:
>> There is no MDOperand in llvm 3.5.
>>
>> v2: Check if kernel metadata is present to avoid crash (EdB).
>> v3: Second attempt to avoid crash: switch off metadata query for llvm < 3.6.
>
> Since the change
intrinsic_image_size",
> - &builtin_builder::_image_size_prototype, 1,
> atom_flags);
> + &builtin_builder::_image_size_prototype, 1,
> + flags | IMAGE_FUNCTION_SUPPORTS_FLOAT_DATA_TYPE);
Reviewed-by: Francisco Jerez
> }
>
>
previously fixed in:
> commit: 781dc7c0e1f41502f18e07c0940af949a78d2792.
> However,
> commit: 259f7291de2387aa3ac5f856b39b7b934a1d8e7d
> removed the fix.
>
> Signed-off-by: Marta Lofstedt
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7 ---
> 1 file c
Ilia Mirkin writes:
> This should include everything. I sent a test for textureSamples to
> piglit a while ago, not sure how to test imageSamples -- apparently ms
> images aren't entirely supported on i965? But I'm not sure via what
> feat that happens.
>
i965 doesn't support MS images because on
Ilia Mirkin writes:
> On Fri, Aug 28, 2015 at 6:58 AM, Francisco Jerez
> wrote:
>> Ilia Mirkin writes:
>>
>>> This should include everything. I sent a test for textureSamples to
>>> piglit a while ago, not sure how to test imageSamples -- apparently ms
>
Matt Turner writes:
> On Fri, Aug 28, 2015 at 12:10 AM, Ilia Mirkin wrote:
>> On Fri, Aug 28, 2015 at 3:02 AM, Matt Turner wrote:
>>> On Thu, Aug 27, 2015 at 8:48 PM, Ilia Mirkin wrote:
Signed-off-by: Ilia Mirkin
---
src/glsl/builtin_functions.cpp | 48
++
Iago Toral writes:
> On Mon, 2016-09-12 at 14:05 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > We will use this in cases where we want to force the vstride of a
>> > src_reg
>> > to 0 to exploit a particular behavi
Iago Toral writes:
> On Mon, 2016-09-12 at 14:19 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > SIMD4x2 64bit data is stored in register space like this:
>> >
>> > r0.0:DF x0 y0 z0 w0
>> > r0.1:DF x1 y1
Jason Ekstrand writes:
> Just looking at the channel enables is not sufficient, at least not on Sky
> Lake. Channels that are disabled by the sample_mask may show up in the
> channel enable register as being enabled even if they are not executing.
> This can cause FIND_LIVE_CHANNEL to return a c
n Ekstrand
[ Francisco Jerez: Trivial simplification of brw_ud1_reg(). ]
Reviewed-by: Francisco Jerez
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
src/mesa/drivers/dri/i965/brw_reg.h | 20
2 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/
This avoids emitting a few extra instructions required to take the
dispatch mask into account when it's known to be tightly packed.
---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +++-
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 8 ++--
2 files changed, 9 insertions(+), 3 dele
were disabled from the
beginning by taking the AND of ce0 with either sr0.2 or sr0.3 depending on
the shader stage. Failure to do so can result in FIND_LIVE_CHANNEL
returning a completely dead channel.
Signed-off-by: Jason Ekstrand
Cc: Francisco Jerez
[ Francisco Jerez: Fix a couple of typos, add
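The masking step described above can be sketched roughly as follows. This is only a toy model, not the driver code: `find_live_channel`, `ce0` and `dispatch_mask` are hypothetical names, with `dispatch_mask` standing in for whichever of sr0.2 or sr0.3 applies to the shader stage.

```cpp
#include <cstdint>

// Toy model (assumption): `ce0` is the channel-enable mask and
// `dispatch_mask` is the mask of channels that were enabled at dispatch.
// Returns the index of the lowest channel set in both masks, or -1 if
// no channel is live.
int find_live_channel(uint32_t ce0, uint32_t dispatch_mask)
{
    // Channels disabled from the beginning must not be reported as live.
    const uint32_t live = ce0 & dispatch_mask;
    if (live == 0)
        return -1;

    int idx = 0;
    while (((live >> idx) & 1u) == 0)
        ++idx;
    return idx;
}
```

Without the AND, a channel that shows up in `ce0` but was never dispatched could be picked, which is the dead-channel case the message warns about.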
The eliminate_find_live_channel optimization eliminates
FIND_LIVE_CHANNEL instructions in cases where control flow is known to
be uniform, and replaces them with 'MOV 0', which in turn unblocks
subsequent elimination of the BROADCAST instruction frequently used on
the result of FIND_LIVE_CHANNEL.
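A rough sketch of the idea on a toy IR, for illustration only. The `Opcode` and `Inst` types are made up here, and "uniform control flow" is approximated (as an assumption) by "not nested inside any IF/ENDIF"; the real pass works on the i965 IR and has its own conditions.

```cpp
#include <vector>

// Hypothetical toy IR, not the i965 one.
enum class Opcode { FIND_LIVE_CHANNEL, MOV_IMM, BROADCAST, IF, ENDIF, OTHER };

struct Inst {
    Opcode op;
    unsigned imm;  // only meaningful for MOV_IMM
};

// Replace FIND_LIVE_CHANNEL with 'MOV 0' wherever the nesting depth is
// zero, which this sketch treats as "control flow known to be uniform".
// Returns true if any instruction was rewritten.
bool eliminate_find_live_channel(std::vector<Inst>& prog)
{
    bool progress = false;
    int depth = 0;

    for (Inst& inst : prog) {
        switch (inst.op) {
        case Opcode::IF:
            ++depth;
            break;
        case Opcode::ENDIF:
            --depth;
            break;
        case Opcode::FIND_LIVE_CHANNEL:
            if (depth == 0) {
                inst.op = Opcode::MOV_IMM;
                inst.imm = 0;
                progress = true;
            }
            break;
        default:
            break;
        }
    }
    return progress;
}
```

Once the result is a known constant, a later pass can fold the BROADCAST that consumes it, which is the follow-up elimination mentioned above.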
Not intended for upstream. Should cause a GPU hang if some thread is
executed with a non-contiguous dispatch mask breaking assumptions of
brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or
Piglit regressions, while replacing brw_stage_has_packed_dispatch()
with a dummy implementation
Jason Ekstrand writes:
> On Fri, Sep 16, 2016 at 3:03 PM, Francisco Jerez
> wrote:
>
>> The eliminate_find_live_channel optimization eliminates
>> FIND_LIVE_CHANNEL instructions in cases where control flow is known to
>> be uniform, and replaces them with
Jason Ekstrand writes:
> On Sep 16, 2016 3:04 PM, "Francisco Jerez" wrote:
>>
>> Not intended for upstream. Should cause a GPU hang if some thread is
>> executed with a non-contiguous dispatch mask breaking assumptions of
>> brw_stage_has_packed_dispat
Vedran Miletić writes:
> OpenCL apps can quote arguments they pass to the OpenCL compiler, most
> commonly include paths containing spaces.
>
> If the Clang OpenCL compiler was called via a shell, the shell would
> split the arguments with respect to quotes and then remove quotes
> before pass
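For illustration, shell-like splitting of an option string might look like the sketch below. `tokenize_options` is a hypothetical helper, not the function used in clover, and it only handles double quotes and spaces (no escapes, no single quotes).

```cpp
#include <string>
#include <vector>

// Split `opts` on spaces, except inside double quotes; the quotes
// themselves are dropped from the resulting arguments.
std::vector<std::string> tokenize_options(const std::string& opts)
{
    std::vector<std::string> args;
    std::string cur;
    bool in_quotes = false;
    bool have_token = false;

    for (char c : opts) {
        if (c == '"') {
            in_quotes = !in_quotes;
            have_token = true;  // "" still yields an (empty) argument
        } else if (c == ' ' && !in_quotes) {
            if (have_token) {
                args.push_back(cur);
                cur.clear();
                have_token = false;
            }
        } else {
            cur += c;
            have_token = true;
        }
    }
    if (have_token)
        args.push_back(cur);
    return args;
}
```

With this, an include path containing a space stays a single argument instead of being split in two.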
Vedran Miletić writes:
> The options specified in the CLOVER_EXTRA_COMPILER_OPTIONS shell
> variable are appended to the compiler options specified by the OpenCL
> program, if any.
> Analogously, the options specified in the CLOVER_EXTRA_LINKER_OPTIONS
> variable are appended to the linker option
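A minimal sketch of the appending behaviour, assuming the variable is simply read with `getenv` and joined with a space. `append_options` and `with_env_options` are hypothetical names, not the clover implementation.

```cpp
#include <cstdlib>
#include <string>

// Append `extra` to `opts`, separated by one space; empty pieces are
// skipped so no stray separator is produced.
std::string append_options(const std::string& opts, const std::string& extra)
{
    if (extra.empty())
        return opts;
    if (opts.empty())
        return extra;
    return opts + " " + extra;
}

// Read `var` from the environment (an unset variable is treated as
// empty) and append its value to the program-supplied options.
std::string with_env_options(const std::string& opts, const char* var)
{
    const char* value = std::getenv(var);
    return append_options(opts, value ? value : "");
}
```

Because the extra options go last, they would win over program-supplied options wherever the compiler gives precedence to later flags.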
Ben Widawsky writes:
> All mobile parts (so far) are GT1. The check added extra confusion
> because it appeared Broxton was missing when it wasn't. Replace it with
> a comment.
>
> Alternatively, I'd be willing to add an is_broxton check.
>
> Cc: Francisco Jerez
Lionel Landwerlin writes:
> Curro: Ping? :)
Reviewed-by: Francisco Jerez
>
> On 26/09/16 20:02, Jason Ekstrand wrote:
>>
>> Looks good to me. Curro, do you see anything wrong with this?
>>
>> --Jason
>>
>>
>> On Sep 26, 2016 7:31 AM, "Li
bile cases.
>
> Cc: Francisco Jerez
> Signed-off-by: Ben Widawsky
> Reviewed-by: Anuj Phogat
Reviewed-by: Francisco Jerez
> ---
> src/intel/common/gen_l3_config.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/common/gen_l3_c
"Juan A. Suarez Romero" writes:
> On Mon, 2016-08-08 at 16:12 +0200, Juan A. Suarez Romero wrote:
>> Hmm... what about the case of exec_size == 4 and writing just a
>> float?
>>
>> I understand in this case we only should mark one word, so the loop
>> should not be 2*inst->regs_written.
>>
>>
Iago Toral Quiroga writes:
> In the vec4 backend the generator sets the execution size for all
> instructions to 8, however, we will have to split certain DF instructions
> to have an execution size of 4, so we need to indicate this explicitly in the
> IR for the generator to set the right execut
Iago Toral Quiroga writes:
> From the HSW PRM, Command Reference, QtrCtrl:
>
>"NibCtrl is only allowed for SIMD4 instructions with a DF (Double Float)
> source or destination type."
>
> v2 (Samuel): Assert that the type is DF.
>
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---
> src/mes
Iago Toral Quiroga writes:
> ---
> src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 829b7d3..88bf895 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
Iago Toral Quiroga writes:
> ---
> src/mesa/drivers/dri/i965/brw_disasm.c | 8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c
> b/src/mesa/drivers/dri/i965/brw_disasm.c
> index c8bdeab..d5e9916 100644
> --- a/src/mesa/drivers/dr
Kenneth Graunke writes:
> On Wednesday, July 20, 2016 9:49:36 PM PDT Francisco Jerez wrote:
>> Both MESA_shader_framebuffer_fetch_non_coherent and the non-coherent
>> variant of KHR_blend_equation_advanced will use this driver hook to
>> request coherency between framebu
This simplifies the code slightly and will allow the SIMD lowering
pass to find out easily what the actual texturing opcode is in order
to determine the maximum execution size of texturing instructions.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 10 --
src/mesa/drivers/dri/i965/brw_fs_
This makes it easier for the caller to find out how many scalar
components are actually read by the instruction. As a bonus we no
longer need to special-case BAD_FILE in the implementation of
fs_inst::regs_read.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +--
1 file changed, 5 insertions(+)
The current logic used to determine the execution size of sampler
messages was based on special-casing several argument and opcode
combinations, which unsurprisingly missed the possibility that some
messages could exceed the payload size limit or not depending on the
number of coordinate components
Kenneth Graunke writes:
> On Friday, August 12, 2016 10:06:29 PM PDT Francisco Jerez wrote:
>> The current logic used to determine the execution size of sampler
>> messages was based on special-casing several argument and opcode
>> combinations, which unsurprisingly missed
This adds a bit of metadata to schedule_node that will be used to
compare available nodes in the scheduling heuristic code based on
which of them unblocks the earliest successor exit node. Note that
assigning exit nodes wouldn't be necessary in a bottom-up scheduler
because we could achieve the sa
This series contains two independent discard-related optimizations:
PATCH 1-2 change the i965 back-end to do per-subspan instead of
per-SIMD-thread discard jumps, which can save bandwidth and ALU cycles
in some scenarios. This improves the FPS of a very simple
microbenchmark that does a costly tex
The critical path of each node is calculated by induction based on the
critical paths of its children, which can be done in a post-order
depth-first traversal of the dependency graph. The current code
implements graph traversal by iterating over all nodes of the graph
and then recursing into its c
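The induction described above can be sketched as a memoized post-order traversal. This is a toy version under stated assumptions: `Node`, `latency` and `critical_path` are made-up names, the graph is a DAG, and latencies are non-negative so that -1 can mark "not computed yet".

```cpp
#include <algorithm>
#include <vector>

// Hypothetical dependency-graph node: its own latency plus the indices
// of the nodes that depend on it.
struct Node {
    int latency;
    std::vector<int> children;
};

// Critical path of node `n`: its latency plus the longest critical path
// among its children. `memo` must be sized like `g` and filled with -1.
int critical_path(const std::vector<Node>& g, int n, std::vector<int>& memo)
{
    if (memo[n] >= 0)
        return memo[n];  // already visited, reuse the result

    int longest = 0;
    for (int c : g[n].children)
        longest = std::max(longest, critical_path(g, c, memo));

    memo[n] = g[n].latency + longest;
    return memo[n];
}
```

The memo table is what keeps shared children from being recomputed once per parent, so each node is processed once.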
This uses the unblocked time of the exit assigned to each available
node to attempt to unblock exit nodes as early as possible,
potentially reducing the runtime of the shader when an exit branch is
taken. There is a natural trade-off between terminating the program
as early as possible and reducin
ANY4H is more efficient than ANY8H and ANY16H because it makes sure
that whenever a whole subspan hits a discard statement it gets
disabled by the EU until the end of the program, regardless of whether
the discard condition is uniform across all channels of the SIMD8-16
thread. OTOH ANY8H/ANY16H w
This may have been the reason people ran into problems with
non-uniform HALT instructions and ended up using the inefficient
ANY16H/ANY8H predicates instead of ANY4H or NORMAL in order to prevent
non-uniform discard. The HALT instruction is able to handle
non-uniform execution masks just fine.
---
Kenneth Graunke writes:
> Note that _mesa_BlendBarrierMESA is not currently hooked up in the
> glapi XML, so we can just rename it. We'll hook it up for the
> KHR_blend_equation_advanced extension shortly.
>
> XXX: sort out exactly what patches Curro plans to push and when.
>
FTR since we discu
Kenneth Graunke writes:
> From: Ilia Mirkin
>
> Signed-off-by: Ilia Mirkin
> Reviewed-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/compiler/glsl/glsl_parser_extras.cpp | 1 +
> src/compiler/glsl/glsl_parser_extras.h | 2 ++
> 2 files changed,
Kenneth Graunke writes:
> From: Ilia Mirkin
>
> v2 (Ken): Fix enum values, drop _mesa_BlendBarrierKHR stub as Curro has
> already implemented it.
>
> Signed-off-by: Ilia Mirkin
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> sr
Kenneth Graunke writes:
> From: Ilia Mirkin
>
> v2 (Ken): Add a BLEND_NONE enum value (no qualifiers in use).
>
> Signed-off-by: Ilia Mirkin
> Reviewed-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/compiler/glsl/ast.h | 5 +
Kenneth Graunke writes:
> Since each qualifier represents a blending mode the shader can be used
> with, we take the union of all possible modes when linking.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/compiler/glsl/linker.cpp | 2 ++
>
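As a small illustration of "taking the union when linking": if each qualifier is one bit, the union is a bitwise OR across shaders. Only `BLEND_NONE` appears in this thread; the other enumerant names and `link_blend_support` are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bitmask of blend qualifiers, one bit per blending mode.
enum BlendMode : uint32_t {
    BLEND_NONE     = 0,
    BLEND_MULTIPLY = 1u << 0,
    BLEND_SCREEN   = 1u << 1,
    BLEND_OVERLAY  = 1u << 2,
};

// The linked program supports every mode that any of its shaders
// declared, i.e. the union of the per-shader masks.
uint32_t link_blend_support(const std::vector<uint32_t>& per_shader)
{
    uint32_t supported = BLEND_NONE;
    for (uint32_t mask : per_shader)
        supported |= mask;
    return supported;
}
```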
Kenneth Graunke writes:
> We're going to handle output qualifiers here too, and calling it "inout"
> seems to be the going convention.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/compiler/glsl/linker.cpp | 4 ++--
>
Kenneth Graunke writes:
> This will be used for emulating GL_KHR_blend_equation_advanced features
> in shader code. We'll pass in the blending mode that's in use, and use
> that in (effectively) a switch statement in the shader.
>
> Signed-off-by: Kenneth Graunke
Rev
Kenneth Graunke writes:
> Don't allow them in glBlendEquationSeparate[i], though, as required
> by the spec.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/main/blend.c | 64
> +++
Kenneth Graunke writes:
> This adds the extension enable (so drivers can advertise it) and the
> extra boolean state flag, GL_BLEND_ADVANCED_COHERENT_KHR, which can
> be set to request coherent blending.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> --
Kenneth Graunke writes:
> We always use a coherent read, and ignore the "opt out" enable flag.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
> 1 file changed, 1 insertion(+)
>
>
Kenneth Graunke writes:
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_link.cpp | 2 ++
> src/mesa/drivers/dri/i965/intel_extensions.c | 4 +++-
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a
Jason Ekstrand writes:
> On Tue, Aug 16, 2016 at 1:54 PM, Francisco Jerez
> wrote:
>
>> This may have been the reason people ran into problems with
>> non-uniform HALT instructions and ended up using the inefficient
>> ANY16H/ANY8H predicates instead of ANY4H or
Jason Ekstrand writes:
> On Tue, Aug 16, 2016 at 1:54 PM, Francisco Jerez
> wrote:
>
>> This adds a bit of metadata to schedule_node that will be used to
>> compare available nodes in the scheduling heuristic code based on
>> which of them unblocks the earliest succ
Kenneth Graunke writes:
> Previously, the scalar TCS backend was generating:
>
> mov(8) g17<1>UD 0xUD{ align1 WE_all 1Q compacted };
> and(8) g17.2<1>UD g0.2<0,1,0>UD 0x0001e000UD { align1 WE_all 1Q };
> shl(8) g17.2<1>UD g17.2<8,8,1>UD 0x000bUD { align1 WE_all
Iago Toral writes:
> On Tue, 2016-08-02 at 18:40 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > ---
>> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +---
>> > 1 file changed, 5 insertions(+), 3 deletions(-
Iago Toral writes:
> On Tue, 2016-08-02 at 18:27 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > ---
>> > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 18 ++
>> > 1 file changed, 18 insertions(+
gateway (barrier msg) mlen 1 rlen 0 { align1 WE_all 1Q };
>
> Using component() accomplishes this.
>
> Fixes GL44-CTS.tessellation_shader.tessellation_shader_tc_barriers.
> barrier_guarded_read_write_calls on Skylake. Probably fixes other
> barrier issues on Gen8+.
>
> v
Matt Turner writes:
> ... instead of assert failing. Can only happen when the program has an
> unconditional infinite loop.
I'm curious how the framebuffer write gets eliminated, I don't think DCE
is smart enough currently to find out that the FB write is unreachable?
> ---
> Sigh.
>
> src/mes
Francisco Jerez writes:
> Matt Turner writes:
>
>> ... instead of assert failing. Can only happen when the program has an
>> unconditional infinite loop.
>
> I'm curious how the framebuffer write gets eliminated, I don't think DCE
> is smart enough curre
Iago Toral writes:
> On Wed, 2016-08-17 at 15:15 -0700, Francisco Jerez wrote:
>> Iago Toral writes:
>>
>> >
>> > On Tue, 2016-08-02 at 18:27 -0700, Francisco Jerez wrote:
>> > >
>> > > Iago Toral Quiroga writes:
>> > >
Iago Toral writes:
> On Mon, 2016-08-08 at 15:26 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > From the HSW PRM, Command Reference, QtrCtrl:
>> >
>> > "NibCtrl is only allowed for SIMD4 instructions with
Kenneth Graunke writes:
> Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
> need to emulate it in the pixel shader. This lowering pass implements
> all the necessary math for advanced blending. It fetches the existing
> framebuffer value using the MESA_shader_framebuffer_fe
is generated by [...]
Francisco Jerez writes:
> Kenneth Graunke writes:
>
>> Don't allow them in glBlendEquationSeparate[i], though, as required
>> by the spec.
>>
>> Signed-off-by: Kenneth Graunke
>
> Reviewed-by:
Iago Toral writes:
> On Mon, 2016-08-08 at 15:58 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > ---
>> > src/mesa/drivers/dri/i965/brw_disasm.c | 8 +++-
>> > 1 file changed, 7 insertions(+), 1 deletion(-
).
> */
> if (src->file == VGRF) {
> - if (var_range_end(var_from_reg(alloc, *src), 4) < ip) {
> + if (var_range_end(var_from_reg(alloc, dst_reg(*src)), 4) <
> ip) {
>
ding mode as a state var.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/main/blend.c | 54
> ++
> src/mesa/main/mtypes.h | 9 +
> 2 files changed, 50 insertions(+), 13 deletions(-)
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/main/barrier.c | 2 +-
> src/mesa/main/barrier.h | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/barrier.c b/src/mesa/main/barrier.c
> index 42a5e0f.
to drop some GL type usage, and drop the
> unnecessary "_mesa_" prefix on a static function.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/main/context.c | 30 +++---
> 1 file changed, 15 insertions(+)
Kenneth Graunke writes:
> From: Ilia Mirkin
>
> Signed-off-by: Ilia Mirkin
> Reviewed-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/main/extensions_table.h | 1 +
> src/mesa/main/mtypes.h | 1 +
> 2 files changed, 2 insertions(+)
>
field.
>
> Signed-off-by: Kenneth Graunke
> Reviewed-by: Francisco Jerez [v1]
v2 is still:
Reviewed-by: Francisco Jerez
> ---
> src/mesa/program/prog_statevars.c | 10 ++
> src/mesa/program/prog_statevars.h | 5 +
> 2 files changed, 15 insertions(+)
>
>
Kenneth Graunke writes:
> We'll do blending in the shader in this case, so just disable the
> hardware blending.
>
> Signed-off-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/i965/brw_cc.c | 2 +-
>
, and non-vec4
> variables (it's easier than finding spec text to justify not
> handling it). Thanks to Francisco Jerez for the feedback.
>
> Signed-off-by: Kenneth Graunke
> ---
> src/compiler/Makefile.sources | 1 +
> src/compiler/gl
gl_SecondaryFragColorEXT should have the same location as gl_FragColor
for the secondary fragment color to be replicated to all fragment
outputs. The incorrect location of gl_SecondaryFragColorEXT would
cause the linker to mark both FRAG_RESULT_COLOR and FRAG_RESULT_DATA0
as being written to, whic
This would cause gl_FragStencilRef to be counted as a color output
incorrectly during the precompile phase, which leads to unnecessary
recompilation on master and could trigger an assertion failure in
fs_visitor::emit_fb_writes() on my i965-fb-fetch branch.
---
src/mesa/drivers/dri/i965/brw_wm.c |
In the fragment shader OutputsWritten is a bitset of FRAG_RESULT_*
enumerants, which represent the location of each color output written
by the shader. The secondary and primary color outputs of a given
render target using dual-source blending have the same location, so
the 'idx' computation below
t *brw,
>struct brw_image_param *param)
> {
> struct gl_buffer_object *obj = u->TexObj->BufferObject;
> -
> + uint32_t size;
Fold the computation of size below into its declaration so you can mark
the variable const. With that fixed:
Reviewed-by: Francisc
Ilia Mirkin writes:
> On Mon, Aug 22, 2016 at 9:59 PM, Francisco Jerez
> wrote:
>> gl_SecondaryFragColorEXT should have the same location as gl_FragColor
>> for the secondary fragment color to be replicated to all fragment
>> outputs. The incorrect location of gl_Sec
Ilia Mirkin writes:
> On Tue, Aug 23, 2016 at 12:05 AM, Francisco Jerez
> wrote:
>> Ilia Mirkin writes:
>>
>>> On Mon, Aug 22, 2016 at 10:55 PM, Francisco Jerez
>>> wrote:
>>>> Ilia Mirkin writes:
>>>>
>>
*sh_prog =
> ctx->_Shader->_CurrentFragmentProgram;
> + struct gl_shader_info *fs_info =
> + &sh_prog->_LinkedShaders[MESA_SHADER_FRAGMENT]->info;
Don't you need a NULL check here to avoid a crash if
there is no valid fragment program bound to the pipeline?
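A sketch of the kind of guard being asked for, using stand-in types. `ShaderProgram`, `LinkedShader`, `ShaderInfo`, `FRAGMENT_STAGE` and `fragment_info` are all hypothetical here; the real code uses the Mesa structures quoted above.

```cpp
// Hypothetical stand-ins for the structures in the quoted hunk.
struct ShaderInfo { unsigned outputs_written; };
struct LinkedShader { ShaderInfo info; };

constexpr int FRAGMENT_STAGE = 4;  // assumed index, illustration only
constexpr int NUM_STAGES = 6;      // assumed count, illustration only

struct ShaderProgram { LinkedShader* linked[NUM_STAGES]; };

// Returns the fragment shader info, or nullptr when no program is bound
// or the bound program has no fragment stage, instead of dereferencing
// a null pointer.
const ShaderInfo* fragment_info(const ShaderProgram* prog)
{
    if (prog == nullptr || prog->linked[FRAGMENT_STAGE] == nullptr)
        return nullptr;
    return &prog->linked[FRAGMENT_STAGE]->info;
}
```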
Ilia Mirkin writes:
> On Mon, Aug 22, 2016 at 10:55 PM, Francisco Jerez
> wrote:
>> Ilia Mirkin writes:
>>
>>> On Mon, Aug 22, 2016 at 9:59 PM, Francisco Jerez
>>> wrote:
>>>> gl_SecondaryFragColorEXT should have the same location as gl_FragC
Kenneth Graunke writes:
> On Monday, August 22, 2016 5:50:49 PM PDT Francisco Jerez wrote:
>> Kenneth Graunke writes:
>>
>> > Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
>> > need to emulate it in the pixel shader. This lowering pass
Ilia Mirkin writes:
> On Tue, Aug 23, 2016 at 12:18 AM, Francisco Jerez
> wrote:
>> Ilia Mirkin writes:
>>
>>> On Tue, Aug 23, 2016 at 12:05 AM, Francisco Jerez
>>> wrote:
>>>> Ilia Mirkin writes:
>>>>
>>>>> O
Francisco Jerez writes:
> Ilia Mirkin writes:
>
>> On Tue, Aug 23, 2016 at 12:18 AM, Francisco Jerez
>> wrote:
>>> Ilia Mirkin writes:
>>>
>>>> On Tue, Aug 23, 2016 at 12:05 AM, Francisco Jerez
>>>> wrote:
>>>>> Il
Iago Toral Quiroga writes:
> So we can access it in the vec4 backend to handle byte offsets into
> registers.
This change has deep implications in the meaning of the vec4 register
objects because the representation of register offsets is now split
between 'reg_offset' and 'subreg_offset', and th
Ilia Mirkin writes:
> I had trouble getting these to apply, perhaps they were meant to go on
> top of something else. Anyways, should be fairly easy for you to test
> out with llvmpipe.
>
It should apply cleanly on latest master now.
>[...]
Iago Toral writes:
> On Tue, 2016-08-23 at 12:58 -0700, Francisco Jerez wrote:
>> Iago Toral Quiroga writes:
>>
>> >
>> > So we can access it in the vec4 backend to handle byte offsets into
>> > registers.
>> This change has deep implication
Ilia Mirkin writes:
> On Wed, Aug 24, 2016 at 4:30 PM, Francisco Jerez
> wrote:
>> Ilia Mirkin writes:
>>
>>> I had trouble getting these to apply, perhaps they were meant to go on
>>> top of something else. Anyways, should be fairly easy for you to t
Ilia Mirkin writes:
> On Thu, Aug 25, 2016 at 12:45 AM, Francisco Jerez
> wrote:
>> Ilia Mirkin writes:
>>
>>> On Wed, Aug 24, 2016 at 4:30 PM, Francisco Jerez
>>> wrote:
>>>> Ilia Mirkin writes:
>>>>
>>>>> I had tr
> - Haswell (shader.py with master)
> - Haswell (shader.py and forcing dual-instanced mode for GS with master)
>
> It would still probably be a good idea to run it through Jenkins just in case.
>
Ran through Jenkins and:
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri
Jan Vesely writes:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97513
> Signed-off-by: Jan Vesely
Reviewed-by: Francisco Jerez
> ---
> src/gallium/state_trackers/clover/api/device.cpp | 2 +-
> src/gallium/state_trackers/clover/core/device.cpp | 6 ++
---
src/compiler/glsl/ir_set_program_inouts.cpp | 9 +++--
src/mesa/main/mtypes.h | 1 +
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/compiler/glsl/ir_set_program_inouts.cpp
b/src/compiler/glsl/ir_set_program_inouts.cpp
index fcfbcd4..06d9973 100644
Currently the mesa state tracker relies on there being two bits set
per dual-source output in the gl_program::OutputsWritten bitset, but
that only worked due to a GLSL front-end bug that caused it to set the
OutputsWritten bit for both location and location+1 even though at the
GLSL level the prima
somewhat skeptical that the result of getTypeAllocSize matches
the expected API size in all cases, but I don't have any better
suggestions for the moment, so patch is:
Acked-by: Francisco Jerez
>
> src/gallium/state_trackers/clover/llvm/codegen/common.cpp | 7 +++
> 1 file
> Fixes CTS tests:
>
> *.tessellation_shader.compilation_and_linking_errors.
> {tc,te}_invalid_array_size_used_for_input_blocks
>
> Piglit's tcs-input-read-nonconst-* tests would be broken by this patch,
> but the tests are wrong. I've submitted a patch to fix those.
>
> Signed-off-by: Kenneth Graunke
The fs_reg::subreg_offset and ::offset fields are now redundant, the
sub-GRF offset can just be added to the single ::offset field
expressed in byte units. The current subreg_offset value can be
recovered by applying the following rule: Replace each rvalue
reference of subreg_offset like 'x = r.su
This series reworks the representation of register region offsets in
the i965 IR to be universally byte-based instead of the rather awkward
split between reg_offset and subreg_offset we have in the FS back-end
right now, or the reg_offset field currently used in the VEC4 IR which
doesn't allow bett
The dst/src_reg::offset field in byte units introduced in the previous
patch is a more straightforward alternative to an offset
representation split between ::reg_offset and ::subreg_offset fields.
The split representation makes it too easy to forget about one of the
offsets while dealing with the
The fs_reg::offset field in byte units introduced in this patch is a
more straightforward alternative to the current register offset
representation split between fs_reg::reg_offset and ::subreg_offset.
The split representation makes it too easy to forget about one of the
offsets while dealing with
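To make the two representations concrete, here is a sketch of converting between a split offset and a single byte offset. It assumes reg_offset counts whole registers and subreg_offset counts bytes within a register; `SplitOffset`, `to_byte_offset` and `from_byte_offset` are hypothetical helpers, not code from the series.

```cpp
// Hypothetical split representation: whole registers plus bytes within
// the register (assumed units, see the note above).
struct SplitOffset {
    unsigned reg_offset;
    unsigned subreg_offset;
};

// Collapse the split form into one offset in bytes.
unsigned to_byte_offset(SplitOffset s, unsigned reg_size)
{
    return s.reg_offset * reg_size + s.subreg_offset;
}

// Recover the split form from a byte offset.
SplitOffset from_byte_offset(unsigned offset, unsigned reg_size)
{
    return SplitOffset{offset / reg_size, offset % reg_size};
}
```

With a single field there is only one value to update when a region is offset, so the "forgot one of the two offsets" class of bug cannot occur.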
---
src/mesa/drivers/dri/i965/brw_shader.cpp | 2 --
src/mesa/drivers/dri/i965/brw_shader.h | 15 ++-
2 files changed, 2 insertions(+), 15 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 29435f6..e599235 100644
--
The previous regs_read value can be recovered by rewriting each
reference of regs_read() like 'x = i.regs_read(j)' to 'x =
DIV_ROUND_UP(i.size_read(j), reg_unit)'.
For the same reason as in the previous patches, this doesn't attempt
to be particularly clever about simplifying the result in the int
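The rewriting rule quoted above amounts to a rounded-up division of a byte count by the register unit. A small sketch, where `div_round_up` mirrors the DIV_ROUND_UP macro and `regs_read_from_size` is a hypothetical helper; the 32-byte unit used in the note below is an assumption for the example.

```cpp
// Rounded-up integer division, the same arithmetic as DIV_ROUND_UP.
constexpr unsigned div_round_up(unsigned a, unsigned b)
{
    return (a + b - 1) / b;
}

// Number of register units touched by a read of `size_read_bytes`
// bytes, i.e. the old regs_read value expressed via size_read.
constexpr unsigned regs_read_from_size(unsigned size_read_bytes,
                                       unsigned reg_unit)
{
    return div_round_up(size_read_bytes, reg_unit);
}
```

For example, with a 32-byte unit a 33-byte read counts as two registers, while a 16-byte read still counts as one.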