In addition, let intel_miptree_create_layout() release the
miptree - it is the allocator.
CC: Jason Ekstrand
Signed-off-by: Topi Pohjolainen
---
src/mesa/drivers/dri/i965/brw_tex_layout.c| 10 +-
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 5 -
src/mesa/drivers/dri/i965/int
On Mon, Jan 16, 2017 at 08:33:09AM -0800, Jason Ekstrand wrote:
>On Mon, Jan 16, 2017 at 1:13 AM, Topi Pohjolainen
><[1]topi.pohjolai...@gmail.com> wrote:
>
> There is the exact same check earlier in brw_miptree_layout() which
> intel_miptree_create_layout() in turn calls unconditionall
On Tuesday, January 17, 2017 8:43:28 AM PST Eduardo Lima Mitev wrote:
> If there is no plan to use brw_print_program_cache elsewhere, I would
> rather keep it a static method where it is used. In general, I prefer
> not polluting header files. Not a big deal anyway; feel free to ignore
> the commen
Commit 42011be1e disabled HiZ when sharing depth buffer externally,
which frees the HiZ buffer.
But in emit_depth_packets() we use that buffer, which generates a crash
in
"piglit.spec.egl_khr_gl_image.egl_khr_gl_renderbuffer_image-clear-shared-image
gl_depth_component24" test when running in Skylake.
Reviewed-by: Eduardo Lima Mitev
On 01/17/2017 08:14 AM, Kenneth Graunke wrote:
> We have a persistent mapping. Don't map it a second time or try to
> unmap it. Just use the pointer.
>
> This most likely would wreak havoc except that this code is unused
> (it's only called from an if (0) debug
Patches 5-7 look good, but I prefer that more experienced eyes take a look
too.
Acked-by: Eduardo Lima Mitev
On 01/17/2017 08:14 AM, Kenneth Graunke wrote:
> The non-LLC story was a horror show. We uploaded data via pwrite
> (drm_intel_bo_subdata), which would stall if the cache BO was in
> use (
From: "Juan A. Suarez Romero"
Prior to Broadwell, we have 8 registers for MOV_INDIRECT.
According to the IVB and HSW PRMs:
"2.When the destination requires two registers and the sources are
indirect, the sources must use 1x1 regioning mode. In addition, the
sources must be assembled from G
From: Matt Turner
On HSW+, scalar DF sources can be accessed using the normal <0,1,0>
region, but on IVB and BYT DF regions must be programmed in terms of
floats. A <0,2,1> region accomplishes this.
---
src/mesa/drivers/dri/i965/brw_eu_emit.c | 26 --
1 file changed, 20 i
From: Iago Toral Quiroga
4-wide DF operations where NibCtrl applies require an execsize of 8
in IvyBridge/BayTrail.
v2:
- Refactor NibCtrl printing (Matt)
Reviewed-by: Matt Turner
---
src/mesa/drivers/dri/i965/brw_disasm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --g
From: Matt Turner
Doing so allows us to use a single MOV in VEC4_OPCODE_TO_DOUBLE instead
of two.
---
src/mesa/drivers/dri/i965/brw_eu_emit.c | 28 +++-
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 13 +--
2 files changed, 28 insertions(+), 13 deletions
Then the SIMD lowering pass will get rid of any compressed instructions with a
scalar source (whether force_writemask_all or not) and we avoid hitting the
Gen7 region decompression bug.
Signed-off-by: Samuel Iglesias Gonsálvez
Suggested-by: Francisco Jerez
---
src/mesa/drivers/dri/i965/brw_fs.cp
From: "Juan A. Suarez Romero"
In IVB and BYT, both regioning parameters and execution sizes are measured as
floats.
So when we have something like:
mov(8) g2<1>DF g3<4,4,1>DF
We are not actually moving 8 doubles (our intention), but 4 doubles.
We need to double the parameters to cope with thi
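The arithmetic behind the snippet above can be sketched in a few lines of C. This is only an illustration under the assumption, taken from the commit messages in this listing, that IVB/BYT count execution size in 32-bit float channels, so an execsize of 8 covers 4 doubles (one 32-byte GRF). The helper names are hypothetical and are not Mesa functions; regioning parameters are not modelled here.

```c
#include <assert.h>

/* Hypothetical helper (not part of Mesa): on IVB/BYT the execution size is
 * counted in 32-bit float channels, so a DF instruction with a given
 * execsize only touches half that many 64-bit values. */
static inline unsigned sketch_doubles_moved(unsigned exec_size_in_floats)
{
    /* A double occupies two float channels; an odd count makes no sense. */
    assert(exec_size_in_floats % 2 == 0);
    return exec_size_in_floats / 2;
}

/* Hypothetical helper: the execsize to program when num_doubles DF values
 * should be processed on hardware that counts channels as floats. */
static inline unsigned sketch_exec_size_for_doubles(unsigned num_doubles)
{
    return num_doubles * 2;
}
```

With these helpers, `sketch_doubles_moved(8)` gives 4, which matches the observation that `mov(8)` on DF operands moves only 4 doubles on this hardware.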
From: "Juan A. Suarez Romero"
The execution data size is the biggest type size of any instruction
operand.
We will use it to know if the instruction deals with DF, because in Ivy
we need to double the execution size and regioning parameters.
v2:
- Fix typo in commit log (Matt)
- Use static inli
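A minimal sketch of the "execution data size" idea described above: take the largest type size among an instruction's operands. The enum, type sizes, and function names below are assumptions made for illustration and do not reproduce the actual i965 helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand types; the real driver has its own register types. */
enum sketch_type { SKETCH_TYPE_UD, SKETCH_TYPE_F, SKETCH_TYPE_DF };

/* Size in bytes of one element of the given (hypothetical) type. */
static inline unsigned sketch_type_size(enum sketch_type type)
{
    return type == SKETCH_TYPE_DF ? 8 : 4;
}

/* Execution data size: the biggest type size of any operand. */
static inline unsigned sketch_exec_data_size(const enum sketch_type *operands,
                                             size_t count)
{
    unsigned size = 0;

    assert(operands != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        unsigned s = sketch_type_size(operands[i]);
        if (s > size)
            size = s;
    }
    return size;
}
```

A result of 8 would then indicate that the instruction deals with DF, which is the case where the commit message says the execution size and regioning parameters need doubling on Ivy Bridge.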
From: "Juan A. Suarez Romero"
When converting a DF to F, we set dst stride to 2, to fulfil alignment
restrictions.
But in IVB/BYT, this is not necessary, as each DF conversion already
writes 2 F, the first one the real value, and the second one a 0. That
is, IVB/BYT already set stride = 2 implic
From: "Juan A. Suarez Romero"
When splitting VEC4_OPCODE_FROM_DOUBLE in Ivybridge/Baytrail, the second
part should use a temporary register, and then move the values to the
second half of the original destination, so we get all the results in the
same register.
v2:
- Fix typos (Matt).
---
src/me
Empirical testing shows that IVB/BYT don't support indirect addressing
with doubles, although this is not documented in the PRM.
This patch applies the same solution as for Cherryview/Broxton and
takes into account that we cannot double the stride, since the
hardware will do it internally.
v2:
- Fix a
The hardware applies the same channel enable signals to both halves of
the compressed instruction which will be just wrong under non-uniform
control flow. Fix this by splitting those instructions to SIMD4.
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 9
From: "Juan A. Suarez Romero"
Keep the original type when dealing with null registers, especially
because we do not want to introduce an implicit conversion between
types that could affect the conditional flags.
This matters especially when the original type is DF, and we are working
on Ivybridge/B
We need to split DF instructions in two on IVB/BYT as it needs an
execsize 8 to process 4 DF values (one GRF in total).
v2:
- Rename helper and make it static inline function (Matt).
- Fix indentation and add braces (Matt).
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/br
From: "Juan A. Suarez Romero"
In the generator we must generate slightly different code for
Ivybridge/Baytrail, because of the way the stride works in
this hardware.
---
src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)
diff
On Tue, Jan 17, 2017 at 10:40:26AM +0100, Eduardo Lima Mitev wrote:
> Patch 5-7 look good, but I prefer that more experienced eyes take a look
> too.
>
> Acked-by: Eduardo Lima Mitev
>
> On 01/17/2017 08:14 AM, Kenneth Graunke wrote:
> > The non-LLC story was a horror show. We uploaded data via
Signed-off-by: Samuel Iglesias Gonsálvez
---
docs/features.txt | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/docs/features.txt b/docs/features.txt
index aff00167dc9..c746277678c 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -107,7 +107,7 @@ GL 3.3, GLSL 3.30
From: "Juan A. Suarez Romero"
Take into account the offset value when getting the var from register.
This is required when dealing with an operation that writes half of the
register (like one d2x in IVB/BYT, which uses exec_size == 4).
Note that for live analysis variables we need to stick to per
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel_extensions.c
index b674b2f494c..69fb09813ee 100644
--- a/sr
Add a new setup_imm_df() that allows the insertion of the instructions
before another one. This will be used in the lowering passes for DF
instructions.
v2:
- Adapt emission of DIM instruction too.
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++
Signed-off-by: Samuel Iglesias Gonsálvez
---
src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++
src/mesa/drivers/dri/i965/intel_screen.c | 6 --
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel
From: "Juan A. Suarez Romero"
When splitting a CMP/MOV instruction with NULL dest, DF sources, and a
conditional modifier, we can't use the flag registers directly, as they will
have the wrong results in IVB/BYT after the scalarization.
Rather, we need to store the result in a temporary register,
Reviewed-by: Nicolai Hähnle
On 16.01.2017 21:17, Bas Nieuwenhuizen wrote:
Otherwise we read past the end of the buffer.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/common/ac_debug.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c
ind
On Mon, 2017-01-16 at 10:21 -0800, Jason Ekstrand wrote:
> +curro
>
> I'm not sure what I think here. TBH, I haven't actually read it in detail
> yet, but here are some first impressions:
> 1) There are two implementations of atan2 (SPIR-V and GLSL) and they should
> be kept in sync. The same
I'm fine with this variant as well, so patches 1-2:
Reviewed-by: Nicolai Hähnle
On 17.01.2017 04:28, Ilia Mirkin wrote:
Signed-off-by: Ilia Mirkin
---
src/gallium/docs/source/screen.rst | 2 ++
src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
src/gallium/drivers/i915/i9
On Tue, 2017-01-17 at 11:34 +0100, Juan A. Suarez Romero wrote:
> > The above does not necessarily sum to "we shouldn't fix it" but it probably
> > does mean it's low-priority at best and we need to be careful.
> >
> > Looking a bit into the math of atan2, I'm not convinced flushing denorms to
>
From: Marek Olšák
Cc: 17.0 13.0
---
src/gallium/drivers/radeonsi/si_shader.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index f404273..10f40a9 100644
--- a/src/gallium/drivers/ra
Reviewed-by: Edward O'Callaghan
On 01/17/2017 11:49 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> Cc: 17.0 13.0
> ---
> src/gallium/drivers/radeonsi/si_shader.c | 7 +--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
> b/src/
On 16 January 2017 at 19:50, Ilia Mirkin wrote:
> What's the problem with using GALLIUM_DRIVER=softpipe /
> GALLIUM_DRIVER=llvmpipe to select between them?
>
Will work, but has the "update existing users" drawback.
Afaict neither scons nor autotools uses such an approach, so I'd rather
keep things sy
On 16 January 2017 at 23:24, Mauro Rossi wrote:
>>> --- a/src/gallium/Android.mk
>>> +++ b/src/gallium/Android.mk
>>> @@ -34,7 +34,9 @@ SUBDIRS += auxiliary/pipe-loader
>>> #
>>>
>>> # swrast
>>> -ifneq ($(filter swrast,$(MESA_GPU_DRIVERS)),)
>>> +ifneq ($(filter llvmpipe,$(MESA_GPU_DRIVERS)),)
Hi Nayan,
I've pushed this patch yesterday and this one just a minute ago.
Christian.
On 16.01.2017 at 14:19, Nayan Deshmukh wrote:
Hi Christian,
Please push this patch.
There are a couple of patches [1] which are not yet reviewed. They are
trivial and are tested by Andy. Please have a look
On 16 January 2017 at 14:44, Jose Fonseca wrote:
> On 16/01/17 13:46, Emil Velikov wrote:
>>
>> On 14 January 2017 at 08:46, Jose Fonseca wrote:
>>>
>>> I suspect this might break builds with LLVM 3.6 or higher.
>>>
>>> The LLVMLinkInJIT must be inside #if ... #endif, and it must not be
>>> expan
Hi Ben,
On 16 January 2017 at 23:31, Ben Crocker wrote:
> If llvm::sys::getHostCPUName() returns "generic", override
> it with "pwr8" (on PPC64LE).
>
> This is a work-around for a bug in LLVM: a table entry for "POWER8NVL"
> is missing, resulting in (big-endian) "generic" being returned on
> litt
st will only add back buffer attachment to window framebuffer when
visual is in double-buffer mode. However, some applications may
render to front buffer even if they have chosen a double-buffer visual.
In this case, no color buffer will be attached when rendering. i965
handles this case correctly,
On 16 January 2017 at 23:38, Ben Crocker wrote:
> Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1,
> now that LLVM bug 26775 and its corollary, 25503, are fixed.
>
> Amendment: remove extraneous spaces in macro def & invocations.
>
> Signed-off-by: Ben Crocker
Stable ?
Cc
Reviewed-by: Nicolai Hähnle
On 17.01.2017 13:49, Marek Olšák wrote:
From: Marek Olšák
Cc: 17.0 13.0
---
src/gallium/drivers/radeonsi/si_shader.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeons
This patch is:
Reviewed-by: Iago Toral Quiroga
On Mon, 2017-01-16 at 17:20 +0100, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> ---
> src/compiler/glsl/ir_optimization.h | 4 +++-
> src/compiler/glsl/lower_instructions.cpp | 19 +++
> 2 files changed, 14 insertions(+),
On Tue, Jan 17, 2017 at 6:25 PM, Christian König
wrote:
> Hi Nayan,
>
> I've pushed this patch yesterday and this one just a minute ago.
>
Thanks for the push. I am planning on sending a similar patch for vaapi.
Regards,
Nayan
Hello list,
The candidate for the Mesa 12.0.5 is now available. Currently we have:
- 38 queued
- 0 nominated (outstanding)
- and 0 rejected patches
Take a look at section "Mesa stable queue" for more information.
Note: This is an extra release for the 12.0 stable branch, as per developers'
fe
https://bugs.freedesktop.org/show_bug.cgi?id=92634
Nayan Deshmukh changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
On 17 January 2017 at 14:55, Nayan Deshmukh wrote:
> On Tue, Jan 17, 2017 at 6:25 PM, Christian König
> wrote:
>> Hi Nayan,
>>
>> I've pushed this patch yesterday and this one just a minute ago.
>>
> Thanks for the push. I am planning on sending a similar patch for vaapi.
>
If this (and the vaapi
On 16 January 2017 at 11:21, Nicolai Hähnle wrote:
> Emil, I'm going to follow up with a patch to try to fix the reported i915
> regression, but feel free to drop the patch from this thread from
> mesa-stable entirely. It's not an important fix.
>
Thanks for the confirmation, Nicolai!
-Emil
On Tue, Jan 17, 2017 at 12:21 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:
> On Mon, Jan 16, 2017 at 08:33:09AM -0800, Jason Ekstrand wrote:
> >On Mon, Jan 16, 2017 at 1:13 AM, Topi Pohjolainen
> ><[1]topi.pohjolai...@gmail.com> wrote:
> >
> > There exact same check earl
On Mon, Jan 16, 2017 at 11:59 PM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:
> On Mon, Jan 16, 2017 at 09:13:59AM -0800, Jason Ekstrand wrote:
> >On Mon, Jan 16, 2017 at 1:13 AM, Topi Pohjolainen
> ><[1]topi.pohjolai...@gmail.com> wrote:
> >
> > Signed-off-by: Topi Pohjola
On Tue, Jan 17, 2017 at 1:24 AM, Juan A. Suarez Romero
wrote:
> Commit 42011be1e disabled HiZ when sharing depth buffer externally,
> which frees the HiZ buffer.
>
> But in emit_depth_packets() we use that buffer, which generates a crash
> in
> "piglit.spec.egl_khr_gl_image.egl_khr_gl_renderbuffer_ima
On 01/16/2017 02:32 PM, Adam Jackson wrote:
On Thu, 2017-01-05 at 14:29 -0700, Kyle Brenneman wrote:
---
src/egl/generate/eglFunctionList.py | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
Reviewed-by: Adam Jackson
Is this too invasive for 13.1?
- ajax
If it helps, almost all
On Tue, Jan 17, 2017 at 9:12 PM, Emil Velikov wrote:
> On 17 January 2017 at 14:55, Nayan Deshmukh wrote:
>> On Tue, Jan 17, 2017 at 6:25 PM, Christian König
>> wrote:
>>> Hi Nayan,
>>>
>>> I've pushed this patch yesterday and this one just a minute ago.
>>>
>> Thanks for the push. I am planning
On 16 January 2017 at 14:16, Bas Nieuwenhuizen wrote:
> On Mon, Jan 16, 2017 at 2:51 PM, Emil Velikov
> wrote:
>> On 14 January 2017 at 02:31, Andres Rodriguez wrote:
>>> On Fri, Jan 13, 2017 at 8:13 PM, Emil Velikov
>>> wrote:
On 13 January 2017 at 23:44, Andres Rodriguez wrote:
>>
On 01/17/2017 06:44 AM, Boyan Ding wrote:
st will only add back buffer attachment to window framebuffer when
visual is in double-buffer mode. However, some applications may
render to front buffer even if they have chosen a double-buffer visual.
In this case, no color buffer will be attached when
Kenneth Graunke writes:
> On Saturday, January 14, 2017 11:09:53 PM PST Francisco Jerez wrote:
>> Hi Ken!
>>
>> Kenneth Graunke writes:
>>
>> > In theory we might have incorrectly NOP'd instructions that write the
>> > flag, but where that flag value isn't used, and yet the instruction
>> > ei
Blits do not need any special treatment as the target buffer
object is added to render cache just as one does for normal draw.
Color clears and resolves in turn require explicit "end of pipe
synchronization". It is not clear what this means exactly but the
assumption is that render cache flush with
instead of calling unconditionally brw_emit_mi_flush() which
does:
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_DEPTH_CACHE_FLUSH |
PIPE_CONTROL_RENDER_TARGET_FLUSH |
PIPE_CONTROL_CS_STALL);
brw
Blorp blits and clears use a heavy hammer before and after each
operation. This is pipe control with almost all the bits set
simultaneously:
PIPE_CONTROL_RENDER_TARGET_FLUSH
PIPE_CONTROL_INSTRUCTION_INVALIDATE
PIPE_CONTROL_CONST_CACHE_INVALIDATE
PIPE_CONTROL_DEPTH_CACHE_FLUSH
PIPE_CON
Current blorp logic issues unconditional "flush everything"
(see brw_emit_mi_flush()) after each render. For example, all
blits issue this unconditionally which shouldn't be needed if
they set the render cache properly so that subsequent renders do the
necessary flushing before drawing.
In case of piglit:
by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush().
The latter splits the flush in two:
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_DEPTH_CACHE_FLUSH |
PIPE_CONTROL_RENDER_TARGET_FLUSH |
Eh... can you just do the release per the schedule? We don't block
releases for features, especially for very old platforms. There will be
another release soon enough.
On 01/17/2017 07:07 AM, Emil Velikov wrote:
> Hi all,
>
> As some of you may know the Intel and Igalia devs are working hard o
Kenneth Graunke writes:
> (Co-authored by Matt Turner.)
>
> Image atomics, for example, return a value - but the shader may not
> want to use it. We assigned a useless VGRF destination. This seemed
> harmless, but it can actually be quite harmful. The register allocator
> has to assign that VG
From: Dave Airlie
This fixes some issues we'd hit later if using viewport
indexes.
Signed-off-by: Dave Airlie
---
src/amd/vulkan/radv_cmd_buffer.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 27fa405..c6f238b 10
Typo in the subject line "an scalar".
Samuel Iglesias Gonsálvez writes:
> Then the SIMD lowering pass will get rid of any compressed instructions with
> scalar
> source (whether force_writemask_all or not) and we avoid hitting the Gen7
> region
> decompression bug.
>
> Signed-off-by: Samuel Ig
Samuel Iglesias Gonsálvez writes:
> From: Iago Toral Quiroga
>
> 4-wide DF operations where NibCtrl applies require an execsize of 8
> in IvyBridge/BayTrail.
>
> v2:
> - Refactor NibCtrl printing (Matt)
>
> Reviewed-by: Matt Turner
Reviewed-by: Francisco Jerez
> ---
> src/mesa/drivers/dri/
On Tuesday, January 17, 2017 11:40:33 AM PST Francisco Jerez wrote:
> Kenneth Graunke writes:
>
> > (Co-authored by Matt Turner.)
> >
> > Image atomics, for example, return a value - but the shader may not
> > want to use it. We assigned a useless VGRF destination. This seemed
> > harmless, but
Samuel Iglesias Gonsálvez writes:
> From: "Juan A. Suarez Romero"
>
> The execution data size is the biggest type size of any instruction
> operand.
>
> We will use it to know if the instruction deals with DF, because in Ivy
> we need to double the execution size and regioning parameters.
>
> v2
Reviewed-by: Bas Nieuwenhuizen
On Tue, Jan 17, 2017 at 9:27 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> This fixes some issues we'd hit later if using viewport
> indexes.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> d
https://bugs.freedesktop.org/show_bug.cgi?id=99442
Bug ID: 99442
Summary: The Talos Principle hangs with Vulkan/Radeon
Product: Mesa
Version: git
Hardware: Other
OS: Linux (All)
Status: NEW
Severity: norma
https://bugs.freedesktop.org/show_bug.cgi?id=99442
--- Comment #1 from Hadrien ---
Created attachment 129011
--> https://bugs.freedesktop.org/attachment.cgi?id=129011&action=edit
vulkaninfo output
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee
https://bugs.freedesktop.org/show_bug.cgi?id=99442
--- Comment #2 from Hadrien ---
Created attachment 129012
--> https://bugs.freedesktop.org/attachment.cgi?id=129012&action=edit
gdb bt output
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for
This number was chosen in an attempt to match the limits applied to
GLSL IR.
A recent attempt to disable the GLSL IR optimisation loop in the i965
backend resulted in 4 loops from The Talos Principle failing to unroll.
Bumping the limit allows them to unroll which results in the instruction
count
Samuel Iglesias Gonsálvez writes:
> From: "Juan A. Suarez Romero"
>
> When converting a DF to F, we set dst stride to 2, to fulfil alignment
> restrictions.
>
> But in IVB/BYT, this is not necessary, as each DF conversion already
> writes 2 F, the first one the real value, and the second one a 0
Samuel Iglesias Gonsálvez writes:
> From: Matt Turner
>
> On HSW+, scalar DF sources can be accessed using the normal <0,1,0>
> region, but on IVB and BYT DF regions must be programmed in terms of
> floats. A <0,2,1> region accomplishes this.
Any reason you're doing this here twice instead of d
Samuel Iglesias Gonsálvez writes:
> From: "Juan A. Suarez Romero"
>
> In IVB and BYT, both regioning parameters and execution sizes are measured as
> floats.
>
> So when we have something like:
>
> mov(8) g2<1>DF g3<4,4,1>DF
>
> We are not actually moving 8 doubles (our intention), but 4 doubles
Samuel Iglesias Gonsálvez writes:
> From: "Juan A. Suarez Romero"
>
> Previous to Broadwell, we have 8 registers for MOV_INDIRECT.
>
> According to the IVB and HSW PRMs:
>
> "2.When the destination requires two registers and the sources are
> indirect, the sources must use 1x1 regioning mode. I
On Tue, Jan 17, 2017 at 11:40 AM, Francisco Jerez wrote:
> Kenneth Graunke writes:
>
>> (Co-authored by Matt Turner.)
>>
>> Image atomics, for example, return a value - but the shader may not
>> want to use it. We assigned a useless VGRF destination. This seemed
>> harmless, but it can actually
On Tue, Jan 17, 2017 at 1:01 PM, Timothy Arceri
wrote:
> This number was chosen in an attempt to match the limits applied to
> GLSL IR.
>
> A recent attempt to disable the GLSL IR optimisation loop in the i965
> backend resulted in 4 loops from The Talos Principle failing to unroll.
> Bumping the
Matt Turner writes:
> On Tue, Jan 17, 2017 at 11:40 AM, Francisco Jerez
> wrote:
>> Kenneth Graunke writes:
>>
>>> (Co-authored by Matt Turner.)
>>>
>>> Image atomics, for example, return a value - but the shader may not
>>> want to use it. We assigned a useless VGRF destination. This seemed
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_blit.c| 16 +---
src/gallium/drivers/radeonsi/si_descriptors.c | 21 +
src/gallium/drivers/radeonsi/si_pipe.h| 1 +
3 files changed, 31 insertions(+), 7 deletions(-)
diff --git a/src/gallium/d
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_descriptors.c | 17 +
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
b/src/gallium/drivers/radeonsi/si_descriptors.c
index fa3eaad..837f393 100644
--- a/src/gallium
From: Marek Olšák
the mutex lock is inside util_range_add.
---
src/gallium/drivers/radeonsi/si_cp_dma.c | 12 +++-
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 06e4899..582e599 100
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.h| 20 ++--
src/gallium/drivers/radeonsi/si_state_shaders.c | 16
2 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_shader.h
b/src/gallium/drivers
From: Marek Olšák
If the shader selector is created with a different context than
the shader variant, we should use the calling context's target machine
for the shader variant.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
---
src/gallium/drivers/radeonsi/si_shader.h| 2
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_descriptors.c | 5 -
src/gallium/drivers/radeonsi/si_state.c | 6 ++
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
b/src/gallium/drivers/radeonsi/si_descriptors.c
i
From: Marek Olšák
the next commit will use it in a clever way, because the CP DMA prefetch
doesn't need this.
---
src/gallium/drivers/radeonsi/si_cp_dma.c | 24 ++--
src/gallium/drivers/radeonsi/si_pipe.h | 1 +
2 files changed, 15 insertions(+), 10 deletions(-)
diff --gi
From: Marek Olšák
---
src/gallium/drivers/radeon/r600_texture.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeon/r600_texture.c
b/src/gallium/drivers/radeon/r600_texture.c
index cba4e7d..971e40a 100644
--- a/src/gallium/drivers/radeon/r600_textu
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_descriptors.c | 8
src/gallium/drivers/radeonsi/si_state.c | 6 ++
src/gallium/drivers/radeonsi/si_state.h | 1 +
3 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_descrip
From: Marek Olšák
---
src/gallium/drivers/radeonsi/si_cp_dma.c | 12 +++-
src/gallium/drivers/radeonsi/si_pipe.h | 5 +
2 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_cp_dma.c
b/src/gallium/drivers/radeonsi/si_cp_dma.c
index 4c79df
From: Marek Olšák
Only vertex buffers use a separate bool flag.
---
src/gallium/drivers/radeonsi/si_descriptors.c | 85 +++
src/gallium/drivers/radeonsi/si_pipe.h| 2 +
src/gallium/drivers/radeonsi/si_state.h | 2 -
src/gallium/drivers/radeonsi/si_state_dr
Previously the last stage would not get optimised until the backend did
its GLSL IR opt loop. I plan on removing that from i965 shortly which
caused huge regressions in Deus-ex and Tomb Raider which have large
constant arrays. Moving lowering before the opt loop in the GLSL linker
fixes this and un
On Tue, Jan 17, 2017 at 2:38 PM, Francisco Jerez wrote:
> Matt Turner writes:
>
>> On Tue, Jan 17, 2017 at 11:40 AM, Francisco Jerez
>> wrote:
>>> Kenneth Graunke writes:
>>>
(Co-authored by Matt Turner.)
Image atomics, for example, return a value - but the shader may not
w
On Wed, 2017-01-18 at 09:50 +1100, Timothy Arceri wrote:
> Previously the last stage would not get optimised until the backend
> did
> its GLSL IR opt loop.
Wait that would be all stages would not get optimised until the backend
called the glsl ir opts. Forgot we worked on a stage at a time for a
Matt Turner writes:
> On Tue, Jan 17, 2017 at 2:38 PM, Francisco Jerez
> wrote:
>> Matt Turner writes:
>>
>>> On Tue, Jan 17, 2017 at 11:40 AM, Francisco Jerez
>>> wrote:
Kenneth Graunke writes:
> (Co-authored by Matt Turner.)
>
> Image atomics, for example, return a v
"Juan A. Suarez Romero" writes:
> On Tue, 2017-01-17 at 11:34 +0100, Juan A. Suarez Romero wrote:
>> > The above does not necessarily sum to "we shouldn't fix it" but it
>> > probably does mean it's low-priority at best and we need to be careful.
>> >
>> > Looking a bit into the math of atan2,
On Tue, Jan 17, 2017 at 10:48 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:
> Current blorp logic issues unconditional "flush everything"
> (see brw_emit_mi_flush()) after each render. For example, all
> blits issue this unconditionally which shouldn't be needed if
> they set render ca
I think the commit message could use some work. How about:
i965/blorp: Use the render cache mechanism instead of explicit flushing
On Tue, Jan 17, 2017 at 10:48 AM, Topi Pohjolainen <
topi.pohjolai...@gmail.com> wrote:
> by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush().
>
Francisco Jerez writes:
> Samuel Iglesias Gonsálvez writes:
>
>> From: "Juan A. Suarez Romero"
>>
>> When converting a DF to F, we set dst stride to 2, to fulfil alignment
>> restrictions.
>>
>> But in IVB/BYT, this is not necessary, as each DF conversion already
>> writes 2 F, the first one th
On Tuesday, January 17, 2017 8:48:24 PM PST Topi Pohjolainen wrote:
> Blits do not need any special treatment as the target buffer
> object is added to render cache just as one does for normal draw.
> Color clears and resolves in turn require explicit "end of pipe
> synchronization". It is not clea
Samuel Iglesias Gonsálvez writes:
> The hardware applies the same channel enable signals to both halves of
> the compressed instruction which will be just wrong under non-uniform
> control flow. Fix this by splitting those instructions to SIMD4.
>
> Signed-off-by: Samuel Iglesias Gonsálvez
> ---