From: Jerome Glisse
Using a 64-bit fence sequence we can directly compare sequence
numbers to know if a fence is signaled or not. Thus the fence
list becomes useless, and so does the fence lock that mainly
protected the fence list.
Things like ring.ready are no longer behind a lock, this should
be ok as
From: Jerome Glisse
With the fence rework it's now easier to aggressively free idle bo
when there is no hole to satisfy the current allocation request.
The hit of some cs ioctl having to go through the sa bo list
and free them is minimal; it happens once in a while and avoids
some fence waiting.
Signed-off-
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon_cs.c |5 +
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c
b/drivers/gpu/drm/radeon/radeon_cs.c
index 82f2e7b0..b3800cb 100644
--- a/drivers/gpu/drm/rade
From: Jerome Glisse
We need to sync with the GFX ring as ttm might have scheduled a bo move
on it, and new commands scheduled for other rings need to wait for the bo
data to be in place.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon_cs.c | 12 ++--
include/drm/radeon_drm.h
From: Jerome Glisse
This adds per-ring allocation management and load balances the
chunks of the temp buffer between the rings. A ring that often
fails to find a hole or, worse, has to wait for a previous fence
will have more chance to grow over the other rings. This ring is
properly CPU starved in a sense.
From: Jerome Glisse
It seems the iMac panel doesn't like it when we change the hot plug setup
and then refuses to work. This should fix:
https://bugzilla.redhat.com/show_bug.cgi?id=726143
Signed-off-by: Matthew Garrett
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/r600.c |8
1
From: Jerome Glisse
It seems the iMac panel doesn't like it when we change the hot plug setup
and then refuses to work. This helps but doesn't fully fix:
https://bugzilla.redhat.com/show_bug.cgi?id=726143
v2: fix typo and improve commit message
Signed-off-by: Matthew Garrett
Signed-off-by: Jerome Gliss
From: Jerome Glisse
It seems the iMac panel doesn't like it when we change the hot plug setup
and then refuses to work. This helps but doesn't fully fix:
https://bugzilla.redhat.com/show_bug.cgi?id=726143
v2: fix typo and improve commit message
Signed-off-by: Matthew Garrett
Signed-off-by: Jerome Gliss
The first chunk reworks fences to use uint64_t; unlike the previous patch,
we only emit the lower 32 bits with the hw. The upper 32 bits are
handled in the fence process function, where a lengthy comment
discusses all the possible things that can go wrong and why it
doesn't matter.
Then taking advantage of faster
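
A minimal sketch of how the upper 32 bits could be rebuilt on the CPU side
when the hardware only writes the lower 32 bits. The helper name
fence_extend_seq() and its parameters are hypothetical and are not taken
from the patch; the sketch also assumes the hardware value advances by
less than 2^32 between two calls.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild a 64-bit sequence from the 32 bits the
 * hardware wrote, using the last 64-bit value the driver has processed.
 */
static uint64_t fence_extend_seq(uint64_t last_seq, uint32_t hw_seq)
{
    /* Start from the upper half of the last known value. */
    uint64_t seq = (last_seq & 0xffffffff00000000ULL) | hw_seq;

    /* If the result went backwards, the lower 32 bits wrapped once. */
    if (seq < last_seq)
        seq += 0x100000000ULL;

    /* Holds unless the full 64-bit counter itself overflows. */
    assert(seq >= last_seq);
    return seq;
}
```

The only state this needs is the last processed 64-bit value, which is why
the hardware never has to emit more than 32 bits.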
From: Christian König
A single global mutex for ring submissions seems sufficient.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon.h |3 +-
drivers/gpu/drm/radeon/radeon_device.c|3 +-
drivers/gpu/drm/radeon/radeon_pm.c| 10 +-
drivers/gpu/d
From: Jerome Glisse
This converts fences to use a uint64_t sequence number. The intention is
to use the fact that uint64_t is big enough that we don't need to
care about wrap around.
Tested with and without writeback using 0xF000 as initial
fence sequence and thus allowing to test the wrap around from
From: Jerome Glisse
Using a 64-bit fence sequence we can directly compare sequence
numbers to know if a fence is signaled or not. Thus the fence
list becomes useless, and so does the fence lock that mainly
protected the fence list.
Things like ring.ready are no longer behind a lock, this should
be ok as
From: Christian König
We are locking the ring emission mutex anyway, so
there is no harm in doing it a bit earlier and
preventing multiple resets from happening at the same time.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon_fence.c | 10 +-
drivers/gpu/drm/radeon/radeon_r
From: Christian König
Instead of hacking the calculation multiple times.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon_gart.c |6 ++
drivers/gpu/drm/radeon/radeon_object.h| 11 +++
drivers/gpu/drm/radeon/radeon_ring.c |6 ++
drivers/gp
From: Christian König
Make the suballocator self-contained with regard to locking.
v2: split the bugfix into a separate patch.
v3: remove some unrelated changes.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon.h|1 +
drivers/gpu/drm/radeon/radeon_sa.c |6 ++
2 files change
From: Christian König
Dumping the current allocations.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon_object.h |5 +
drivers/gpu/drm/radeon/radeon_ring.c | 22 ++
drivers/gpu/drm/radeon/radeon_sa.c | 14 ++
3 files changed, 41
From: Christian König
Instead of offset + size keep start and end offset directly.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon.h|4 ++--
drivers/gpu/drm/radeon/radeon_cs.c |4 ++--
drivers/gpu/drm/radeon/radeon_object.h |4 ++--
drivers/gpu/drm/rade
From: Christian König
Allocating and freeing it separately.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon.h |4 ++--
drivers/gpu/drm/radeon/radeon_cs.c|4 ++--
drivers/gpu/drm/radeon/radeon_gart.c |4 ++--
drivers/gpu/drm/radeon/radeon_obje
From: Christian König
Define the interface without modifying the allocation
algorithm in any way.
v2: rebase on top of fence new uint64 patch
Signed-off-by: Jerome Glisse
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon.h |1 +
drivers/gpu/drm/radeon/radeon_gart
From: Jerome Glisse
Use one wait queue for all rings. When one ring progresses, the others
likely do too, and we are not expecting to have a lot of waiters
anyway.
Also add a fence_wait_any that will wait until the first fence
in the fence array (one fence per ring) is signaled. This allows
to wait on al
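
A minimal sketch of the check a fence_wait_any style function could run
each time the shared wait queue wakes it up. It only shows the "is any
fence in the per-ring array signaled" predicate, not the sleeping itself.
NUM_RINGS, the function name, and the convention that a target of 0 means
"no fence for this ring" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RINGS 3 /* hypothetical ring count */

/*
 * Return the index of the first ring whose awaited sequence number has
 * been reached, or -1 if none has. A target of 0 means "not waiting on
 * this ring" (an assumption of this sketch).
 */
static int first_signaled_ring(const uint64_t last_seq[NUM_RINGS],
                               const uint64_t target_seq[NUM_RINGS])
{
    assert(last_seq != NULL && target_seq != NULL);

    for (int i = 0; i < NUM_RINGS; i++) {
        if (target_seq[i] == 0)
            continue; /* no fence to wait for on this ring */
        if (target_seq[i] <= last_seq[i])
            return i; /* this ring's fence is signaled */
    }
    return -1;
}
```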
From: Christian König
A startover with a new idea for a multiple ring allocator.
Should perform as well as a normal ring allocator as long
as only one ring does something, but falls back to a more
complex algorithm if more complex things start to happen.
We store the last allocated bo in last, we
From: Jerome Glisse
Directly use the suballocator to get small chunks of memory.
It's equally fast and doesn't crash when we encounter a GPU reset.
v2: rebased on new SA interface.
Signed-off-by: Christian König
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/evergreen.c|
From: Jerome Glisse
It isn't necessary any more and the suballocator seems to perform
even better.
Signed-off-by: Christian König
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h | 17 +--
drivers/gpu/drm/radeon/radeon_device.c|1 -
drivers/gpu/drm/radeon/r
From: Christian König
We can now protect the semaphore ram with a
fence, so free it immediately.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon_ttm.c |7 +--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
b/drivers/
From: Jerome Glisse
It never really belonged there in the first place.
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/radeon.h | 16
drivers/gpu/drm/radeon/radeon_cs.c|4 ++--
drivers/gpu/drm/radeon/radeon_fence.c |3 ---
drivers/gpu/drm/radeon/r
From: Christian König
If we don't store local data into global variables
it isn't necessary to lock anything.
v2: rebased on new SA interface
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/evergreen_blit_kms.c |1 -
drivers/gpu/drm/radeon/r600.c | 13 +---
drive
From: Jerome Glisse
No need to malloc it any more.
Signed-off-by: Jerome Glisse
Signed-off-by: Christian König
---
drivers/gpu/drm/radeon/evergreen_cs.c | 10 +++---
drivers/gpu/drm/radeon/r100.c | 38 ++--
drivers/gpu/drm/radeon/r200.c |2 +-
drivers/g
Attached are 2 patches for dumping everything needed to replay a faulty
command stream. I haven't added a module option in the radeon patch,
but the idea would be to enable the dumping only if it's requested.
I know AMD folks would like to reuse AMD internal format, but unless
we can quickly get ACK to re
From: Jerome Glisse
Allow a driver to provide a custom read callback for a debugfs file.
Useful if the driver tries to dump a big buffer; avoids double buffering.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/drm_debugfs.c | 19 ---
drivers/gpu/drm/i915/i915_debugfs.c
From: Jerome Glisse
This tries to identify the faulty user command stream that caused the
lockup. If it finds one, it creates a big blob that contains all
information needed to replay the faulty command stream.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/r100.c |6 +-
drivers/gp
So here is an improved patchset, where I split the ground work necessary
for the dumping into its own patches. The debugfs improvement could
probably be useful to Intel instead of having i915 have its own
debugfs file stuff.
The lockup dumping public API has been moved into radeon_drm.h
Stressing th
From: Jerome Glisse
Allow a driver to provide a custom read callback for a debugfs file.
Useful if the driver tries to dump a big buffer; avoids double buffering.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/drm_debugfs.c | 19 ---
drivers/gpu/drm/i915/i915_debugfs.c
From: Jerome Glisse
Allow a radeon debugfs file to provide a custom read function. This
is useful in case you don't want to double buffer with seq_file,
or simply in case the buffer data is too big to be buffered by
seq_file.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/r100.c
From: Jerome Glisse
Allow callers of radeon_vm_bo_update_pte to get the virtual bo offset.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h |3 ++-
drivers/gpu/drm/radeon/radeon_cs.c |2 +-
drivers/gpu/drm/radeon/radeon_gart.c | 11 ---
3 files changed,
From: Jerome Glisse
This tries to identify the faulty user command stream that caused the
lockup. If it finds one, it creates a big blob that contains all
information; this includes the packet stream but also a snapshot of all
bo used by the faulty packet stream.
This means that the blob is self-contained and ca
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h | 528 +-
drivers/gpu/drm/radeon/radeon_ring.c |3 +-
2 files changed, 267 insertions(+), 264 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/d
Make the format more future proof by adding a total chunk
size field that allows old userspace to skip over potentially new
chunks. Not sure this is really needed but hey.
Jerome
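
A minimal sketch of why a total chunk size field lets old userspace skip
chunks it does not understand. The header layout, the chunk id value and
the function name are hypothetical; the real dump format is not shown in
this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chunk header: total_size counts the header plus payload. */
struct chunk_header {
    uint32_t id;
    uint32_t total_size;
};

#define CHUNK_ID_KNOWN 1u /* the only chunk type this old parser knows */

/*
 * Walk a dump and count the chunks this parser understands. Unknown
 * chunk ids are skipped using total_size alone. Returns -1 if the
 * buffer is malformed or truncated.
 */
static int count_known_chunks(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int known = 0;

    assert(buf != NULL);

    while (len - off >= sizeof(struct chunk_header)) {
        struct chunk_header h;

        memcpy(&h, buf + off, sizeof(h));
        if (h.total_size < sizeof(h) || h.total_size > len - off)
            return -1; /* size field is inconsistent with the buffer */
        if (h.id == CHUNK_ID_KNOWN)
            known++;
        off += h.total_size; /* works for known and unknown ids alike */
    }
    return off == len ? known : -1;
}
```

Without the size field, a parser that meets an unknown id has no way to
find where the next chunk starts.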
From: Jerome Glisse
Allow a driver to provide a custom read callback for a debugfs file.
Useful if the driver tries to dump a big buffer; avoids double buffering.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/drm_debugfs.c | 19 ---
drivers/gpu/drm/i915/i915_debugfs.c
From: Jerome Glisse
Allow a radeon debugfs file to provide a custom read function. This
is useful in case you don't want to double buffer with seq_file,
or simply in case the buffer data is too big to be buffered by
seq_file.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/r100.c
From: Jerome Glisse
Allow callers of radeon_vm_bo_update_pte to get the virtual bo offset.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h |3 ++-
drivers/gpu/drm/radeon/radeon_cs.c |2 +-
drivers/gpu/drm/radeon/radeon_gart.c | 11 ---
3 files changed,
From: Jerome Glisse
This tries to identify the faulty user command stream that caused the
lockup. If it finds one, it creates a big blob that contains all
information; this includes the packet stream but also a snapshot of all
bo used by the faulty packet stream.
This means that the blob is self-contained and ca
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h | 528 +-
drivers/gpu/drm/radeon/radeon_ring.c |3 +-
2 files changed, 267 insertions(+), 264 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/d
OK, this time it is the final version; I added a bunch of flags to the cmd
buffer to make the userspace tools' life easier.
Cheers,
Jerome
From: Jerome Glisse
Allow a driver to provide a custom read callback for a debugfs file.
Useful if the driver tries to dump a big buffer; avoids double buffering.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/drm_debugfs.c | 19 ---
drivers/gpu/drm/i915/i915_debugfs.c
From: Jerome Glisse
Allow a radeon debugfs file to provide a custom read function. This
is useful in case you don't want to double buffer with seq_file,
or simply in case the buffer data is too big to be buffered by
seq_file.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/r100.c
From: Jerome Glisse
Allow callers of radeon_vm_bo_update_pte to get the virtual bo offset.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h |3 ++-
drivers/gpu/drm/radeon/radeon_cs.c |2 +-
drivers/gpu/drm/radeon/radeon_gart.c | 11 ---
3 files changed,
From: Jerome Glisse
This tries to identify the faulty user command stream that caused the
lockup. If it finds one, it creates a big blob that contains all
information; this includes the packet stream but also a snapshot of all
bo used by the faulty packet stream.
This means that the blob is self-contained and ca
From: Jerome Glisse
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/radeon.h | 528 +-
drivers/gpu/drm/radeon/radeon_ring.c |3 +-
2 files changed, 267 insertions(+), 264 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/d
From: Jerome Glisse
GPUs with a low amount of RAM can fail at pinning the new framebuffer before
unpinning the old one. On such a failure, retry by unpinning the old one before
pinning the new one, which works around the issue. This is somewhat
ugly but only affects those old GPUs we care about.
Signed-off-by: Jerome
From: Jérôme Glisse
Laptops with a Turks/Thames GPU will freeze if DPM is enabled. It seems
the SMC engine is relying on some state inside the CP engine. The CP needs
to chew at least one packet for it to get into a good state for dynamic
power management.
This patch simply disables and re-enables DPM aft
From: Jérôme Glisse
In order for hibernation to work reliably we need to clean up the
compute ring more thoroughly. Hibernation is different from suspend/
resume: when we resume from hibernation the hardware is first
fully initialized by the regular kernel, then the freeze callback happens
(which corresp
From: Jérôme Glisse
In order for hibernation to work reliably we need to properly turn
off the SDMA block. Sadly, after numerous attempts I have not found a
proper sequence for a clean and full shutdown. So simply reset both
SDMA blocks; this makes hibernation work reliably on the sea island GPU
famil
From: Jérôme Glisse
In order for hibernation to work reliably we need to clean up the
compute ring more thoroughly. Hibernation is different from suspend/
resume: when we resume from hibernation the hardware is first
fully initialized by the regular kernel, then the freeze callback happens
(which corresp
From: Jérôme Glisse
In order for hibernation to work reliably we need to properly turn
off the SDMA block. Sadly, after numerous attempts I have not found a
proper sequence for a clean and full shutdown. So simply reset both
SDMA blocks; this makes hibernation work reliably on the sea island GPU
famil
From: Jérôme Glisse
The current code never allowed the page pool to actually fill up.
This fixes it, so that we only start freeing pages from the pool when
we go over the pool size.
Signed-off-by: Jérôme Glisse
Reviewed-by: Mario Kleiner
Tested-by: Michel Dänzer
Cc: Thomas Hellstrom
Cc:
From: Jérôme Glisse
Calls to set_memory_wb() incur heavy TLB flush and IPI costs. To
minimize those, wait until the pool grows beyond the batch size before
draining the pool.
Signed-off-by: Jérôme Glisse
Reviewed-by: Mario Kleiner
Cc: Michel Dänzer
Cc: Thomas Hellstrom
Cc: Konrad Rzeszutek Wilk
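
A minimal sketch of one possible reading of the batching rule above: the
pool is left alone until it exceeds its maximum size by at least one
batch, and only then is the excess drained in one go, so the expensive
set_memory_wb() path runs once per batch rather than once per page. The
struct, its fields and the function name are hypothetical, not the TTM
code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool bookkeeping; all values are page counts. */
struct page_pool {
    unsigned int npages;   /* pages currently held in the pool */
    unsigned int max_size; /* size the pool is allowed to fill up to */
    unsigned int batch;    /* minimum excess before draining */
};

/*
 * Return how many pages should be freed right now. Zero means "keep
 * accumulating": either the pool is not full yet, or the excess is
 * still smaller than one batch.
 */
static unsigned int pool_pages_to_free(const struct page_pool *pool)
{
    assert(pool != NULL);

    if (pool->npages < pool->max_size + pool->batch)
        return 0;
    return pool->npages - pool->max_size;
}
```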
From: Jérôme Glisse
The current code never allowed the page pool to actually fill up.
This fixes it, so that we only start freeing pages from the pool when
we go over the pool size.
Changed since v1:
- Move the page batching optimization to its separate patch.
Changed since v2:
- Do not re
From: Jérôme Glisse
Calls to set_memory_wb() incur heavy TLB flush and IPI costs. To
minimize those, wait until the pool grows beyond the batch size before
draining the pool.
Signed-off-by: Jérôme Glisse
Reviewed-by: Mario Kleiner
Reviewed-and-Tested-by: Michel Dänzer
Reviewed-by: Konrad Rzeszutek
From: Jerome Glisse
Avoid creating a temporary platform device, which leads to issues
when several radeon GPUs are in the same computer. Instead directly use
the radeon device for requesting firmware.
Signed-off-by: Jerome Glisse
---
drivers/gpu/drm/radeon/cik.c| 25 +++--
From: Jerome Glisse
If a buffer is never bound to a virtual memory pagetable then don't try
to unbind it. The only drawback is that we don't update the pagetable when
unbinding the ib pool buffer, which is fine because it only happens at
suspend or module unload/shutdown.
Cc: stable at kernel.org
Sign
From: Jerome Glisse
The UVD ring can't use scratch, thus it needs the writeback buffer to keep
a valid address or radeon_ring_backup will trigger a kernel fault.
It's ok to not unpin the writeback buffer on suspend as it lives in
gtt and thus does not need eviction.
Reported and tracked by Wojtek
From: Jerome Glisse
There might be an issue with lockup detection when scheduling on an
empty ring that has been sitting idle for a while. Thus update
the lockup tracking data when scheduling new work on an empty ring.
Signed-off-by: Jerome Glisse
Tested-by: Andy Lutomirski
Cc: stable at vger.ke