From: Jerome Glisse
The ucode we have for Hawaii does not support the special 0x1000 NOP
type 3 packet, and this leads to the GPU reading invalid memory. As packet
type 2 still exists, just use packet type 2.
Note that this only partially fixes the Hawaii issues; some zbuffer tiling
issues are still present.
Ch
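
A minimal sketch of the two packet forms discussed above, assuming the usual PM4 header layout (type in bits 31:30, count in bits 29:16, opcode in bits 15:8). Reading the "0x1000 special nop" as opcode 0x10 with count 0x3FFF is an assumption made for illustration, and the helper names pkt3_header and pad_with_pkt2 are hypothetical, not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Type-2 packet: a single filler dword with no payload. */
#define PKT2_NOP 0x80000000u

/* Build a type-3 header: type 3, 14-bit count, 8-bit opcode. */
static uint32_t pkt3_header(uint32_t opcode, uint32_t count)
{
    return (3u << 30) | ((count & 0x3FFFu) << 16) | ((opcode & 0xFFu) << 8);
}

/* Pad cs[len..target) with type-2 NOPs and return the new length. */
static size_t pad_with_pkt2(uint32_t *cs, size_t len, size_t target)
{
    while (len < target)
        cs[len++] = PKT2_NOP;
    return len;
}
```

Under these assumptions pkt3_header(0x10, 0x3FFF) yields 0xFFFF1000, while padding with PKT2_NOP never asks the prefetcher to look past the current dword.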
From: Jérôme Glisse
The trace buffer allows dumping a command buffer that is fully replayable
as a standalone C program. This makes debugging lockups immensely
simpler. This patch only plugs in the minimal core and is still
missing the fancier aspects that are in r600g. It however already
proved us
From: Jerome Glisse
The GPU packet prefetcher hates the ugly big NOP packets; in some cases
they lead to prefetching invalid memory. Apparently Hawaii
is particularly sensitive to this.
Note that this only partially fixes the Hawaii issues; some zbuffer tiling
issues are still present.
Signed-off-by:
From: Jerome Glisse
There is no longer any reason to load with RTLD_GLOBAL, and for some drivers
this even results in dlclose failing to unload, leading to catastrophic
failure with the swrast fallback.
Signed-off-by: Jérôme Glisse
---
src/glx/dri_common.c | 10 +-
1 file changed, 5 insertions(+)
From: Jerome Glisse
This moves the tracing timeout and printing into the winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).
Many files are touched because of winsys API changes.
v2: Do not write the lockup file if the ib unique id does not match the last one
Signed-off-by: Jerome Glisse
--
From: Jerome Glisse
This moves the tracing timeout and printing into the winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).
Many files are touched because of winsys API changes.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r300/r300_context.c| 2 +-
src/g
From: Jerome Glisse
Most tests pass; the remaining issues are with border color and swizzle.
Based on ircnick's patch.
v2: Restaged commit hunks
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/radeonsi/si_state.c | 71 -
src/gallium/drivers/radeonsi/sid.h | 7
2
From: Jerome Glisse
v2: Remove leftover code
v3: Restage the commit properly so hunks of the first one are not in
the second one.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/radeonsi/r600_texture.c | 11 ++--
src/gallium/drivers/radeonsi/si_state.c | 81 +
From: Jerome Glisse
Most tests pass; the remaining issues are with border color and swizzle.
Based on ircnick's patch.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/radeonsi/si_state.c | 165 +---
src/gallium/drivers/radeonsi/sid.h | 7 ++
2 files changed, 96 insertion
From: Jerome Glisse
v2: Remove leftover code
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/radeonsi/r600_texture.c | 11 ---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/r600_texture.c
b/src/gallium/drivers/radeonsi/r600_texture.c
Rebased on top of the latest libdrm patch. With a small modification to the ddx
you can also have tiled front buffer rendering. But again we need to wait for
the next mesa release before changing the ddx to assume by default that it is
installed with a recent enough mesa.
No regressions, just new tests that pass.
Cheers,
J
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/radeonsi/r600_texture.c | 4 +-
src/gallium/drivers/radeonsi/si_state.c | 83 +
2 files changed, 14 insertions(+), 73 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/r600_texture.c
This is the mesa counterpart for 2D tiling; it is missing the change to
configure.ac to require the proper libdrm. Will respin once I know.
Cheers,
Jerome
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
From: Jerome Glisse
Build-time option: set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to
enable it.
When enabled, after each cs submission the code will try to detect a lockup by
waiting for one of the buffers of the cs to become idle; after a timeout it
will consider that the cs triggered a lo
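
The detection idea described above (wait for a buffer of the cs to go idle, and give up after a timeout) can be sketched roughly as follows. This is only an illustration: the callback type buffer_idle_fn and the names cs_looks_locked_up and idle_after_n are hypothetical, and a poll budget stands in for the real timeout.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true once the watched buffer is idle. */
typedef bool (*buffer_idle_fn)(void *ctx);

/* Poll up to max_polls times; returning true means "assume a lockup". */
static bool cs_looks_locked_up(buffer_idle_fn is_idle, void *ctx,
                               unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (is_idle(ctx))
            return false;   /* buffer went idle, the cs completed */
    }
    return true;            /* budget exhausted, treat as lockup */
}

/* Stand-in buffer that reports idle after *ctx further polls. */
static bool idle_after_n(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

A real implementation would sleep between polls and dump the cs when the function returns true; both are left out here.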
From: Jerome Glisse
Build-time option: set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to
enable it.
When enabled, after each cs submission the code will try to detect a lockup by
waiting for one of the buffers of the cs to become idle; after a timeout it
will consider that the cs triggered a lo
From: Jerome Glisse
Same as on r600, trace cs execution by writing the cs offset after each
state. This allows pinpointing a lockup inside the command stream and
narrows down the scope of the lockup investigation.
v2: Use WRITE_DATA packet instead of WRITE_MEM
Signed-off-by: Jerome Glisse
---
src/gallium/
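
To illustrate how such trace points narrow an investigation, here is a small sketch. It assumes each state records the cs offset at which it ends, and that the last value the GPU wrote back is available after the lockup; the function name first_suspect_state and the data layout are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * end_offsets[i] is the cs offset written after state i was emitted.
 * last_trace is the last offset the GPU managed to write back.
 * Returns the index of the first state that did not complete,
 * or n when every state completed.
 */
static size_t first_suspect_state(const uint32_t *end_offsets, size_t n,
                                  uint32_t last_trace)
{
    for (size_t i = 0; i < n; i++) {
        if (end_offsets[i] > last_trace)
            return i;
    }
    return n;
}
```

With end offsets {10, 25, 40} and a last written value of 25, the suspect is state 2, so only the dwords between offsets 25 and 40 need inspecting.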
From: Jerome Glisse
Same as on r600, trace cs execution by writing the cs offset after each
state. This allows pinpointing a lockup inside the command stream and
narrows down the scope of the lockup investigation.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/radeonsi/r600_hw_context.c | 58
From: Jerome Glisse
Some code calling the flush function passed a fence pointer that points
to an old fence, which should be unreferenced to avoid leaking the fence.
Candidate for 9.1
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_pipe.c | 8 +---
src/gallium/drivers/radeonsi
From: Jerome Glisse
This workaround disables hyperz if writes to the zbuffer are disabled. Somehow
using hyperz when not writing to the zbuffer triggers a GPU lockup. See:
https://bugs.freedesktop.org/show_bug.cgi?id=60848
Candidate for 9.1
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/
From: Jerome Glisse
It seems that enabling alpha test confuses the GPU about the order in
which it should perform Z testing. So force the order programmed
through db shader control.
v2: Only force z order when alpha test is enabled
v3: Update db shader when binding new dsa + spelling fix
Sig
From: Jerome Glisse
It seems that enabling alpha test confuses the GPU about the order in
which it should perform Z testing. So force the order programmed
through db shader control.
v2: Only force z order when alpha test is enabled
Signed-off-by: Jerome Glisse
Reviewed-by: Marek Olšák
---
From: Jerome Glisse
It seems that enabling alpha test confuses the GPU about the order in
which it should perform Z testing. So force the order programmed
through db shader control.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_state.c | 5 +
src/gallium/drivers/r6
From: Jerome Glisse
We are now seeing cs that can go over the vram+gtt size. To avoid
failing, flush early any cs that goes over 70% (gtt+vram) usage. 70%
is used to allow for some fragmentation.
The idea is to compute a gross estimate of the memory requirement of
each draw call. After each draw call, memory wi
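
The 70% rule above amounts to a simple threshold test. The following is a sketch only; the function name cs_should_flush_early and the byte-count parameters are assumptions for illustration, not the driver's actual interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the estimated cs memory use exceeds 70% of vram + gtt. */
static bool cs_should_flush_early(uint64_t used, uint64_t vram, uint64_t gtt)
{
    uint64_t total = vram + gtt;
    /* Divide before multiplying so the product cannot overflow;
     * this rounds the limit down by a few bytes at most. */
    uint64_t limit = (total / 10) * 7;
    return used > limit;
}
```

The caller would accumulate the per-draw estimate into used and flush the cs as soon as this returns true.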
From: Jerome Glisse
We are now seeing cs that can go over the vram+gtt size. To avoid
failing, flush early any cs that goes over 70% (gtt+vram) usage. 70%
is used to allow for some fragmentation.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_state.c| 4
src/gallium/drivers/
From: Jerome Glisse
v2: Add virtual address to dma src/dst offset for cayman
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_hw_context.c | 46 ++
src/gallium/drivers/r600/evergreen_state.c | 201
src/gallium/drivers/r600/evergreend.h
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_hw_context.c | 44 ++
src/gallium/drivers/r600/evergreen_state.c | 197
src/gallium/drivers/r600/evergreend.h | 15 ++
src/gallium/drivers/r600/r600.h
From: Jerome Glisse
Add ring support; you can create a cs for each ring. The DMA ring is
a bit special regarding relocations, as you must emit as many relocations
as there are uses of the buffer.
v2: - Improved comment on relocation changes
- Use a single thread to queue cs submission, this simplifies dri
So the design is mostly the same as previously. A few changes: first, I use only
one thread to offload all cs submission, whether gfx or dma. The reason is that
using one thread for gfx and one for dma leads to more complex synchronization
with no gain, i.e. when submitting gfx you would need to make sure previou
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_compute.c | 2 ++
src/gallium/drivers/r600/r600_hw_context.c | 1 +
src/gallium/drivers/r600/r600_pipe.c | 6 ++
src/gallium/drivers/r600/r600_pipe.h | 1 +
src/gallium/drivers/r600/r60
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r300/r300_context.c | 2 +-
src/gallium/drivers/r600/r600_pipe.c | 2 +-
src/gallium/drivers/radeonsi/radeonsi_pipe.c | 2 +-
src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 2 +-
sr
So the first patch is the winsys change, while the second patch implements multi-ring
in the r600g driver. It uses a stack to keep track of the order in which
rings must be submitted. It will only pop the necessary entries from the stack
depending on the current request.
I think this addresses all concerns from
From: Jerome Glisse
The design is to take advantage of the fact that the kernel will emit
a semaphore when a buffer is referenced by different rings. So the only
thing we need to enforce synchronization between the dma and gfx/compute
rings is to make sure that we never reference the same bo at the same
time on the dm
From: Jerome Glisse
Upcoming async dma support relies on the winsys knowing about GPU families.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r300/r300_chipset.c | 57 +--
src/gallium/drivers/r300/r300_chipset.h | 27 --
src/gallium/drivers/r300/r300_emit.c
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_pipe.c | 18 +-
src/gallium/drivers/r600/r600_pipe.h | 2 +-
src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 3 +--
src/gallium/winsys/radeon/drm/radeon_drm_cs.h | 2 +-
4 fil
From: Jerome Glisse
It's a build-time option: you need to set R600_TRACE_CS to 1 and it
will print to stderr all cs along with the cs trace point value, which
gives the last offset into a cs processed by the GPU.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_hw_context.c | 41 ++
From: Jerome Glisse
This forces surfaces allocated from the ddx to be considered as height
aligned on 8 and fixes the 1D->2D tiling transition that results from
this.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_texture.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
d
From: Jerome Glisse
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.
v2: really use fast clear; still random issues with some tiles,
need to try more flush combinations, fix dept
From: Jerome Glisse
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.
v2: really use fast clear; still random issues with some tiles,
need to try more flush combinations, fix dept
From: Jerome Glisse
This brings r600g almost in line with the closed source driver when
it comes to flushing and synchronization patterns.
v2-v4: history lost somewhere in outer space
v5: Fix compute size of flushing, use define for flags, update
worst case cs size requirement for flush, treat rs7
So those were tested on evergreen (caicos, redwood, turks, barts) and on
rv740 and did not regress anything. I can't test other r6xx/r7xx as currently
mesa master triggers lockups on anything other than rv740.
I am going to merge those by the end of this week.
Cheers,
Jerome
From: Jerome Glisse
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.
v2: really use fast clear; still random issues with some tiles,
need to try more flush combinations, fix dept
From: Jerome Glisse
This brings r600g almost in line with the closed source driver when
it comes to flushing and synchronization patterns.
v2-v4: history lost somewhere in outer space
v5: Fix compute size of flushing, use define for flags, update
worst case cs size requirement for flush, treat rs7
Ok, so this time it should be it. The following patches seem to behave properly.
I am still in the process of checking again that they don't regress anything;
I should be done Monday or Tuesday. If there is no objection by then I
will commit them.
Note that you need a kernel patch for those and that by defaul
From: Jerome Glisse
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.
v2: really use fast clear; still random issues with some tiles,
need to try more flush combinations, fix dept
From: Jerome Glisse
This brings r600g almost in line with the closed source driver when
it comes to flushing and synchronization patterns.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_compute.c | 8 +-
.../drivers/r600/evergreen_compute_internal.c | 4 +-
src/ga
So I finally have something that doesn't seem to lock up (I ran in a loop several
things that used to lock up on various GPUs for over 24 hours without a single lockup)
or regress anything. It's a bundle deal: the first patch is needed for lockup
avoidance. Tested on:
rv610, rv635, rv670, rv710, rv730, rv740
From: Jerome Glisse
On r6xx/r7xx, shader resource management needs to make sure that the
shader does not go over the gpr register limit. Each specific
asic has a maximum number of registers that can be split between shader stages.
For each stage the shader must not use more registers than the
limit programmed.
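
The constraint described above can be written down as a small check. This is a sketch under stated assumptions: the stage list (PS, VS, GS, ES), the function name gpr_split_is_valid, and the numbers in the usage note are illustrative only and not taken from the patch.

```c
#include <stdbool.h>

/* Illustrative stage list; the real set of stages is an assumption. */
enum { STAGE_PS, STAGE_VS, STAGE_GS, STAGE_ES, NUM_STAGES };

/*
 * limit[] holds the per-stage limits programmed into the hardware,
 * used[] the registers each bound shader needs, and max_gprs the
 * asic-wide total that the limits are carved out of.
 */
static bool gpr_split_is_valid(const unsigned limit[NUM_STAGES],
                               const unsigned used[NUM_STAGES],
                               unsigned max_gprs)
{
    unsigned total = 0;
    for (int i = 0; i < NUM_STAGES; i++) {
        if (used[i] > limit[i])
            return false;   /* a shader exceeds its stage limit */
        total += limit[i];
    }
    return total <= max_gprs;   /* the split itself must fit the asic */
}
```

For example, limits of {64, 48, 8, 8} fit an asic with 128 registers, and a pixel shader needing 65 registers would then force the split to be reprogrammed.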
From: Jerome Glisse
On r6xx/r7xx, shader resource management needs to make sure that the
shader does not go over the gpr register limit. Each specific
asic has a maximum number of registers that can be split between shader stages.
For each stage the shader must not use more registers than the
limit programmed.
From: Jerome Glisse
A previous command stream might have set any of the constant buffers,
and the previous address might no longer be valid, so the GPU might
preload constants from a random invalid address, possibly triggering
a lockup.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergree
From: Jerome Glisse
To avoid GPU lockups, registers must be emitted in a specific order
(no kidding ...). This patch reworks atom emission so the order in which
atoms are emitted with respect to each other is always the same. We
don't have any information on what the correct order is, so the order
will need to b
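
One way to picture a fixed emission order is a table that is always walked front to back, no matter in which order atoms were marked dirty. The sketch below assumes that model; struct atom and emit_dirty_atoms are hypothetical names, and recording ids stands in for writing register packets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical atom: a group of registers with a fixed table slot. */
struct atom {
    unsigned id;
    bool dirty;
};

/*
 * Walk the table in its fixed order, record the id of every dirty
 * atom in emitted[], clear the dirty flag, and return the count.
 */
static size_t emit_dirty_atoms(struct atom *table, size_t n,
                               unsigned *emitted)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (table[i].dirty) {
            emitted[count++] = table[i].id;
            table[i].dirty = false;
        }
    }
    return count;
}
```

Marking atom 3 dirty before atom 1 still emits atom 1 first, so changing the relative order later only means reordering the table.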
From: Jerome Glisse
To avoid GPU lockups, registers must be emitted in a specific order
(no kidding ...). This patch reworks atom emission so the order in which
atoms are emitted with respect to each other is always the same. We
don't have any information on what the correct order is, so the order
will need to b
From: Jerome Glisse
Use an atom for sampler state. Does not provide new functionality
or fix any bug. Just a step toward a fully atom-based r600g.
v2: Split seamless on r6xx/r7xx into its own atom. Make sure it's
emitted after the sampler and with a pipeline flush before, otherwise
it does not take
From: Jerome Glisse
Use an atom for sampler state. Does not provide new functionality
or fix any bug. Just a step toward a fully atom-based r600g.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/evergreen_hw_context.c | 117 -
src/gallium/drivers/r600/evergreen_state.c |
This patch atomizes the sampler state. No regressions on evergreen;
I can't really check r6xx/r7xx as they all lock up for me with mesa
master and 3.5
The plan is to convert everything to atoms and then predefine the atom
emission order.
Cheers,
Jerome
From: Jerome Glisse
Flushing and synchronization only need to happen at the beginning
and end of a cs, and after each draw packet if necessary. This
patch is especially needed for the hyperz/htile feature.
v2: Separate evergreen and r6xx/r7xx flushing/syncing to allow
easier specialization of each function
From: Jerome Glisse
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.
v2: really use fast clear; still random issues with some tiles,
need to try more flush combinations, fix dept
So this patch series adds hyperz but does not enable it by default. I
think I addressed all comments in v9 for htile. I am also asking to
include the flushing rework, as without it hyperz locks up with things
such as gears.
So with both patches most applications should be fine with hyperz, but
application t
From: Jerome Glisse
htile is used for HiZ and HiS support and fast Z/S clears.
This commit just adds the htile setup and Fast Z clear.
We don't take full advantage of HiS with that patch.
v2: really use fast clear; still random issues with some tiles,
need to try more flush combinations, fix dept
So this patch adds hyperz but does not enable it. I have been working on
that for the last 7 months; I just fail at not making it lock up. At the same time
I would prefer having this code upstream so I don't have to rebase.
I tried to match the fglrx sync & flush pattern but that would basically mean
rewriting the
From: Jerome Glisse
DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export
and there is no depth/stencil export.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_pipe.h |1 +
src/gallium/drivers/r600/r600_state.c| 45 -
From: Vadim Girlin
It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export
mode (EXPORT_4C_16BPC), also there should be at least one CB, and the PS
shouldn't export depth/stencil.
Signed-off-by: Vadim Girlin
---
src/gallium/drivers/r600/evergreen_state.c | 46 ++
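
The enabling condition stated above (every CB uses the 16-bit export mode, there is at least one CB, and the PS does not export depth/stencil) can be sketched as a predicate. The function name can_enable_dual_export and its parameters are hypothetical and only mirror the wording of the commit message.

```c
#include <stdbool.h>

/*
 * cb_is_16bpc[i] is true when color buffer i uses the 16-bit
 * export mode (EXPORT_4C_16BPC in the commit message).
 */
static bool can_enable_dual_export(unsigned nr_cbufs,
                                   const bool cb_is_16bpc[],
                                   bool ps_exports_depth_stencil)
{
    if (nr_cbufs == 0 || ps_exports_depth_stencil)
        return false;
    for (unsigned i = 0; i < nr_cbufs; i++) {
        if (!cb_is_16bpc[i])
            return false;   /* one 32-bit export CB disables it */
    }
    return true;
}
```

Since the result depends on the bound framebuffer as well as the shader, it would have to be re-evaluated whenever either of them changes.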
From: Vadim Girlin
In some cases TGSI shader has more color outputs than the number of CBs,
so it seems we need to limit the number of color exports. This requires
different shader variants depending on the nr_cbufs, but on the other hand
we are doing less exports, which are very costly.
v2: fix
From: Jerome Glisse
Z or stencil textures should not be created with the z/stencil
flags for surface creation, as they are intended to be bound
as textures.
v2: remove broken code
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_texture.c | 32 --
1 fi
From: Jerome Glisse
Z or stencil textures should not be created with the z/stencil
flags for surface creation, as they are intended to be bound
as textures.
Signed-off-by: Jerome Glisse
---
src/gallium/drivers/r600/r600_texture.c | 34 +-
1 files changed, 19 insertio
From: Jerome Glisse
Virtual address space puts userspace in charge of its GPU
address space. It's up to userspace to bind bos into the virtual
address space. Command streams can then be executed using the
IB_VM chunk.
This patch adds support for this configuration. It doesn't remove
the 64K i
From: Jerome Glisse
Virtual address space puts userspace in charge of its GPU
address space. It's up to userspace to bind bos into the virtual
address space. Command streams can then be executed using the
IB_VM chunk.
This patch adds support for this configuration. It doesn't remove
the 64K i
From: Jerome Glisse
Virtual address space puts userspace in charge of its GPU
address space. It's up to userspace to bind bos into the virtual
address space. Command streams can then be executed using the
IB_VM chunk.
This patch adds support for this configuration. It doesn't remove
the 64K i
67 matches