Re: [PATCH 2/2] drm: Move drm_gem ioctl kerneldoc to uapi file

2025-07-18 Thread Christian König
On 14.07.25 14:40, Simona Vetter wrote: > On Mon, Jul 14, 2025 at 11:50:32AM +0200, Christian König wrote: >> On 11.07.25 23:55, Simona Vetter wrote: >>> On Fri, Jul 11, 2025 at 10:53:42AM -0400, David Francis wrote: >>>> The drm_gem ioctls were documented in intern

Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM

2025-07-17 Thread Christian König
me more problematic side effects (drawing more power etc...) > It would seem that all devices > would have this issue, no? Also, I'm not familiar with how > kms_plane_alpha_blend works, but does this also support that test > failing as the cause? Correct, it affects all APUs which

Switching over to GEM refcounts and a bunch of cleanups

2025-07-16 Thread Christian König
Hi guys, so I hope Thomas is back from vacation while I will be on vacation for the next two weeks. Here is the patch set which cleans up TTM and XE in preparation for switching to drm_exec. Please take a look and let me know what you think about it. Regards, Christian.

[PATCH 7/7] drm/xe: remove workaround for TTM internals

2025-07-16 Thread Christian König
This should no longer be necessary, TTM doesn't lock the BO without a reference any more. Only compile tested! Signed-off-by: Christian König --- drivers/gpu/drm/xe/xe_bo.c | 32 +--- 1 file changed, 5 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/d

[PATCH 5/7] drm/ttm: move zombie handling into ttm_bo_evict

2025-07-16 Thread Christian König
Both callers do the same thing, so we can trivially unify that. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index

[PATCH 1/7] drm/ttm: replace TTMs refcount with the DRM refcount v3

2025-07-16 Thread Christian König
re-enable disabled test v3: handle another case in i915 Signed-off-by: Christian König --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 35 ++--- .../gpu/drm/ttm/tests/ttm_bo_validate_test.c | 8 +- drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 2 - drivers/gpu/drm/ttm/ttm_bo.c

[PATCH 4/7] drm/ttm: switch to ttm_bo_lru_for_each_reserved_guarded for swapout

2025-07-16 Thread Christian König
Instead of the walker wrapper use the underlying foreach. Saves us quite a bit of complexity and LOC. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 57 ++-- drivers/gpu/drm/ttm/ttm_device.c | 19 --- include/drm/ttm/ttm_bo.h

[PATCH 2/7] drm/ttm: remove ttm_lru_walk_ops

2025-07-16 Thread Christian König
It's just another layer of indirection. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 12 ++- drivers/gpu/drm/ttm/ttm_bo_util.c | 2 +- include/drm/ttm/ttm_bo.h | 34 +-- 3 files changed, 17 insertions(+), 31 dele

[PATCH 6/7] drm/ttm: use ttm_bo_lru_for_each_reserved_guarded in evict_all

2025-07-16 Thread Christian König
Use the for_each loop to evict all BOs of a resource manager as well. This greatly simplifies the handling and finally allows us to remove ttm_bo_evict_first(). Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 51 +- drivers/gpu/drm/ttm

[PATCH 3/7] drm/ttm: grab BO reference before locking it

2025-07-16 Thread Christian König
Previously we always grabbed the BO reference after taking the lock, but that isn't necessary any more. So avoid doing that and cleanup the handling here. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo_util.c | 15 +-- 1 file changed, 9 insertions(+), 6 dele

[PATCH] drm/ttm: rename ttm_bo_put to _fini v2

2025-07-16 Thread Christian König
Give TTM BOs a separate cleanup function. No functional change, but the next step in removing the TTM BO reference counting and replacing it with the GEM object reference counting. v2: move the code around a bit to make it clearer what's happening Signed-off-by: Christian König --- dr

Re: [PATCH v5 2/3] drm/amdgpu: Reset the clear flag in buddy during resume

2025-07-16 Thread Christian König
On 16.07.25 12:47, Christian König wrote: > On 16.07.25 12:28, Arunpravin Paneer Selvam wrote: >> Hi Dave, >> >> I am trying to push this series into drm-misc-fixes, but I get the below >> error when dim push-branch drm-misc-fixes. >> >> dim:ERROR:e24c180b4

Re: [PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs

2025-07-16 Thread Christian König
On 16.07.25 12:46, Philipp Stanner wrote: > +Cc Greg, Sasha > > On Wed, 2025-07-16 at 12:40 +0200, Michel Dänzer wrote: >> On 16.07.25 11:57, Philipp Stanner wrote: >>> On Wed, 2025-07-16 at 09:43 +, cao, lin wrote: Hi Philipp, Thank you for the review. I found that th

Re: [PATCH v5 2/3] drm/amdgpu: Reset the clear flag in buddy during resume

2025-07-16 Thread Christian König
w Auld) >>    - Having this function being able to flip the state either way would be >> good. (Matthew Brost) >> >> v3(Matthew Auld): >>    - Do merge step first to avoid the use of extra reset flag. >> >> Signed-off-by: Arunpravin Paneer Selvam

Re: [PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs

2025-07-16 Thread Christian König
On 16.07.25 12:13, Danilo Krummrich wrote: > On Wed Jul 16, 2025 at 12:05 PM CEST, lin cao wrote: >> [AMD Official Use Only - AMD Internal Distribution Only] > > Two small off-topic remarks from my side. :) > > Can you please remove "AMD Official Use Only" header when sending to public > mailing

Re: [PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs

2025-07-16 Thread Christian König
On 16.07.25 11:43, cao, lin wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > > > Hi Philipp, > > Thank you for the review. I found that this optimization was introduced 9 > years ago in commit 777dbd458c89d4ca74a659f85ffb5bc817f29a35 ("drm/amdgpu: > drop a dummy wakeup sched

Re: [PATCH v4 2/3] drm/amdgpu: Reset the clear flag in buddy during resume

2025-07-16 Thread Christian König
ng able to flip the state either way would be > good. (Matthew Brost) > > v3(Matthew Auld): > - Do merge step first to avoid the use of extra reset flag. You've lost me with that :) > > Signed-off-by: Arunpravin Paneer Selvam > Suggested-by: Christian König >

Re: [PATCH] drm/sched: Remove optimization that causes hang when killing dependent jobs

2025-07-15 Thread Christian König
ation B to hang. > > Remove the optimization by deleting drm_sched_entity_clear_dep() and its > usage, ensuring the scheduler is always woken up when dependencies are > cleared. > > Signed-off-by: Lin.Cao Reviewed-by: Christian König > --- > drivers/gpu/drm/scheduler/sched_

Re: [PATCH 6.15 085/192] drm/gem: Acquire references on GEM handles for framebuffers

2025-07-15 Thread Christian König
On 15.07.25 15:56, Simona Vetter wrote: > On Tue, Jul 15, 2025 at 03:43:08PM +0200, Christian König wrote: >> We are about to revert this patch. Not sure if backporting it makes sense at >> the moment. > > I think it still makes sense, at least as an interim fix. >

Re: [PATCH 6.15 085/192] drm/gem: Acquire references on GEM handles for framebuffers

2025-07-15 Thread Christian König
ect > instance") triggers the segmentation fault easily by using the dma-buf > field more widely. The underlying issue with reference counting has > been present before. > > v2: > - acquire the handle instead of the BO (Christian) > - fix comment style (Christian) > -

Re: [PATCH v2] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-15 Thread Christian König
On 15.07.25 14:20, Philipp Stanner wrote: > On Tue, 2025-07-15 at 12:52 +0200, Christian König wrote: >> On 15.07.25 12:27, Philipp Stanner wrote: >>> On Tue, 2025-07-15 at 09:51 +, cao, lin wrote: >>>> >>>> [AMD Official Use Only - AMD Internal

Re: [PATCH v2] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-15 Thread Christian König
ful) by me and if the tests succeed we can merge it. >> However, I'd feel better if you could clarify more why that function is >> the right place to solve the bug. >> >> >> Thanks, >> P. >> >> >>> >>> >>> Add drm_sched_wak

Re: [PATCH v2 0/7] drm: Revert general use of struct drm_gem_object.dma_buf

2025-07-15 Thread Christian König
If it helps Acked-by: Christian König for the entire series. Regards, Christian. On 15.07.25 10:07, Thomas Zimmermann wrote: > Revert the use of drm_gem_object.dma_buf back to .import_attach->dmabuf > in the affected places. Separates references to imported and exported DMA > bufs

Re: [PATCH 13/18] ttm/pool: enable memcg tracking and shrinker. (v2)

2025-07-15 Thread Christian König
On 14.07.25 07:18, Dave Airlie wrote: > From: Dave Airlie > > This enables all the backend code to use the list lru in memcg mode, > and set the shrinker to be memcg aware. > > It adds the loop case for when pooled pages end up being reparented > to a higher memcg group, that newer memcg can

Re: [PATCH v2] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-14 Thread Christian König
On 14.07.25 15:27, Philipp Stanner wrote: > On Mon, 2025-07-14 at 15:08 +0200, Christian König wrote: >> >> >> On 14.07.25 14:46, Philipp Stanner wrote: >>> regarding the patch subject: the prefix we use for the scheduler >>> is: >>> drm/sche

Re: [PATCH v2] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-14 Thread Christian König
On 14.07.25 14:46, Philipp Stanner wrote: > regarding the patch subject: the prefix we use for the scheduler is: > drm/sched: > > > On Mon, 2025-07-14 at 14:23 +0800, Lin.Cao wrote: > >> When Application A submits jobs (a1, a2, a3) and application B submits > > s/Application/application > >

Re: [PATCH 18/18] ttm: add support for a module option to disable memcg pool

2025-07-14 Thread Christian König
On 14.07.25 07:18, Dave Airlie wrote: > From: Dave Airlie > > There is an existing workload that cgroup support might regress, > the systems are setup to allocate 1GB of uncached pages at system > startup to prime the pool, then any further users will take them > from the pool. The current cgroup

Re: [PATCH v5 2/8] drm/sched: Allow drivers to skip the reset and keep on running

2025-07-14 Thread Christian König
On 14.07.25 12:16, Philipp Stanner wrote: > On Mon, 2025-07-14 at 11:23 +0200, Christian König wrote: >> On 13.07.25 21:03, Maíra Canal wrote: >>> Hi Christian, >>> >>> On 11/07/25 12:20, Christian König wrote: >>>> On 11.07.25 15:37, Philipp Stanne

Re: [PATCH v2] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-14 Thread Christian König
: > - Move drm_sched_wakeup() to after drm_sched_fence_scheduled() > > Signed-off-by: Lin.Cao Reviewed-by: Christian König > --- > drivers/gpu/drm/scheduler/sched_entity.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/scheduler/sched_

Re: [PATCH 2/2] drm: Move drm_gem ioctl kerneldoc to uapi file

2025-07-14 Thread Christian König
On 11.07.25 23:55, Simona Vetter wrote: > On Fri, Jul 11, 2025 at 10:53:42AM -0400, David Francis wrote: >> The drm_gem ioctls were documented in internal file drm_gem.c >> instead of uapi header drm.h. Move them there and change to >> appropriate kerneldoc formatting. >> >> Signed-off-by: David Fr

Re: [PATCH v5 2/8] drm/sched: Allow drivers to skip the reset and keep on running

2025-07-14 Thread Christian König
On 13.07.25 21:03, Maíra Canal wrote: > Hi Christian, > > On 11/07/25 12:20, Christian König wrote: >> On 11.07.25 15:37, Philipp Stanner wrote: >>> On Fri, 2025-07-11 at 15:22 +0200, Christian König wrote: >>>> >>>> >>>> On 08.07.25

Re: [PATCH v5 2/8] drm/sched: Allow drivers to skip the reset and keep on running

2025-07-14 Thread Christian König
On 11.07.25 19:23, Matthew Brost wrote: > On Fri, Jul 11, 2025 at 05:20:49PM +0200, Christian König wrote: >> On 11.07.25 15:37, Philipp Stanner wrote: >>> On Fri, 2025-07-11 at 15:22 +0200, Christian König wrote: >>>> >>>> >>>> On 08.07.25

Re: [PATCH v5 2/8] drm/sched: Allow drivers to skip the reset and keep on running

2025-07-11 Thread Christian König
On 11.07.25 15:37, Philipp Stanner wrote: > On Fri, 2025-07-11 at 15:22 +0200, Christian König wrote: >> >> >> On 08.07.25 15:25, Maíra Canal wrote: >>> When the DRM scheduler times out, it's possible that the GPU isn't hung; >>> instead, a job

Re: [PATCH] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-11 Thread Christian König
On 11.07.25 15:13, Philipp Stanner wrote: > On Thu, 2025-07-10 at 08:33 +, cao, lin wrote: >> >> [AMD Official Use Only - AMD Internal Distribution Only] >> >> >> >> Hi Christian, >> >> >> Thanks for your suggestion, I modified the patch as: > > Looks promising. You'll send a v2 I guess :) We

Re: [PATCH v5 2/8] drm/sched: Allow drivers to skip the reset and keep on running

2025-07-11 Thread Christian König
On 08.07.25 15:25, Maíra Canal wrote: > When the DRM scheduler times out, it's possible that the GPU isn't hung; > instead, a job just took unusually long (longer than the timeout) but is > still running, and there is, thus, no reason to reset the hardware. This > can occur in two scenarios: >

Re: [PATCH 2/9] Revert "drm/gem: Acquire references on GEM handles for framebuffers"

2025-07-11 Thread Christian König
On 11.07.25 12:08, Simona Vetter wrote: > On Fri, Jul 11, 2025 at 11:35:17AM +0200, Thomas Zimmermann wrote: >> This reverts commit 5307dce878d4126e1b375587318955bd019c3741. >> >> We're going to revert the dma-buf handle back to separating dma_buf >> and import_attach->dmabuf in struct drm_gem_obje

Re: [PATCH 0/9] drm: Revert general use of struct drm_gem_object.dma_buf

2025-07-11 Thread Christian König
framebuffers") is conceptually broken. Linus still notices boot-up > hangs that might be related. Did I miss that response? What exactly is the issue? > Reverting the whole thing is the only sensible action here. Feel free to add Acked-by: Christian König to the entire serie

[PATCH 1/2] drm/ttm: fix locking in test ttm_bo_validate_no_placement_signaled

2025-07-10 Thread Christian König
The test works even without it, but lockdep starts screaming when it is activated. Trivially fix it by acquiring the lock before we try to allocate something. Signed-off-by: Christian König --- drivers/gpu/drm/ttm/tests/ttm_bo_validate_test.c | 9 + 1 file changed, 5 insertions(+), 4

[PATCH 2/2] drm/ttm: remove ttm_bo_validate_swapout test

2025-07-10 Thread Christian König
etely remove the test. We already validate swapout on the device level and that test seems to be stable. Signed-off-by: Christian König --- .../gpu/drm/ttm/tests/ttm_bo_validate_test.c | 51 --- 1 file changed, 51 deletions(-) diff --git a/drivers/gpu/drm/ttm/

Re: [PATCH v6 5/5] drm/amdgpu: do not resume device in thaw for normal hibernation

2025-07-10 Thread Christian König
On 10.07.25 14:13, Mario Limonciello wrote: > On 7/10/2025 2:23 AM, Samuel Zhang wrote: >> For normal hibernation, the GPU does not need to be resumed in thaw since it is >> not involved in writing the hibernation image. Skipping resume in this case >> can reduce the hibernation time. >> >> On VM with 8 * 19

Re: [PATCH v4 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-10 Thread Christian König
On 10.07.25 11:01, Simona Vetter wrote: > On Wed, Jul 09, 2025 at 12:52:05PM -0400, Rodrigo Vivi wrote: >> On Wed, Jul 09, 2025 at 05:18:54PM +0300, Raag Jadav wrote: >>> On Wed, Jul 09, 2025 at 04:09:20PM +0200, Christian König wrote: >>>> On 09.07.25 15:41, Simona V

Re: [PATCH] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-10 Thread Christian König
First of all you need to CC the scheduler maintainers, try to use the get_maintainer.pl script. Adding them on CC. On 10.07.25 08:36, Lin.Cao wrote: > When Application A submits jobs (a1, a2, a3) and application B submits > job b1 with a dependency on a2's scheduler fence, killing application A >

Re: [PATCH v4 1/9] drm: Add a vendor-specific recovery method to device wedged uevent

2025-07-09 Thread Christian König
thod 'WEDGED=vendor-specific' for such errors. Vendors >> must provide additional recovery documentation if this method >> is used. >> >> v2: fix documentation (Raag) >> >> Cc: André Almeida >> Cc: Christian König >> Cc: David Airl

Re: [PATCH v4 1/5] drm/ttm: add new api ttm_device_prepare_hibernation()

2025-07-09 Thread Christian König
On 09.07.25 08:44, Samuel Zhang wrote: > This new api is used for hibernation to move GTT BOs to shmem after > VRAM eviction. shmem will be flushed to swap disk later to reduce > the system memory usage for hibernation. > > Signed-off-by: Samuel Zhang Reviewed-by:

Re: [PATCH v6 14/15] drm/sched: Queue all free credits in one worker invocation

2025-07-09 Thread Christian König
On 08.07.25 17:31, Tvrtko Ursulin wrote: > > On 08/07/2025 14:02, Christian König wrote: >> On 08.07.25 14:54, Tvrtko Ursulin wrote: >>> >>> On 08/07/2025 13:37, Christian König wrote: >>>> On 08.07.25 11:51, Tvrtko Ursulin wrote: >>>>> T

Re: [PATCH v2 1/1] drm/amdkfd: return -ENOTTY for unsupported IOCTLs

2025-07-09 Thread Christian König
On 09.07.25 06:56, Lazar, Lijo wrote: > On 7/8/2025 8:40 PM, Deucher, Alexander wrote: >> [Public] >> >> >> I seem to recall -ENOTSUPP being frowned upon for IOCTLs. >> >> > Going by documentation - > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html > Good point. > EOPNOTSUPP: > Feature (li

Re: [PATCH] drm/amdgpu: fix the logic to validate fpriv and root bo

2025-07-09 Thread Christian König
On 09.07.25 09:16, Sunil Khatri wrote: > Fix the smatch warning, > smatch warnings: > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c:2146 amdgpu_pt_info_read() > error: we previously assumed 'fpriv' could be null (see line 2146) > > "if (!fpriv && !fpriv->vm.root.bo)", It has to be an OR condition >
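
The smatch warning quoted above comes down to short-circuit evaluation. The following is a minimal sketch of the corrected OR form, assuming hypothetical stand-in structures (demo_fpriv, demo_vm, demo_root, demo_bo) rather than the real amdgpu types, which are not visible in this excerpt:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the amdgpu types. */
struct demo_bo { int id; };
struct demo_root { struct demo_bo *bo; };
struct demo_vm { struct demo_root root; };
struct demo_fpriv { struct demo_vm vm; };

/*
 * Returns true when there is nothing to dump: either the private data
 * is missing or its root BO is missing. With "||" the right-hand side
 * is evaluated only when fpriv is non-NULL, so NULL is never
 * dereferenced. Writing "&&" instead would evaluate fpriv->vm.root.bo
 * exactly in the case where fpriv is NULL.
 */
static bool demo_nothing_to_dump(const struct demo_fpriv *fpriv)
{
	return !fpriv || !fpriv->vm.root.bo;
}
```

A caller would typically bail out early when the helper returns true and only then touch the root BO.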

Re: [PATCH] dma_buf/sync_file: Enable signaling for fences when querying status

2025-07-08 Thread Christian König
On 08.07.25 14:03, Mikko Perttunen wrote: > From: Mikko Perttunen > > dma_fence_get_status is not guaranteed to return valid information > on if the fence has been signaled or not if SW signaling has not > been enabled for the fence. To ensure valid information is reported, > enable SW signaling

Re: [PATCH v6 14/15] drm/sched: Queue all free credits in one worker invocation

2025-07-08 Thread Christian König
On 08.07.25 14:54, Tvrtko Ursulin wrote: > > On 08/07/2025 13:37, Christian König wrote: >> On 08.07.25 11:51, Tvrtko Ursulin wrote: >>> There is no reason to queue just a single job if scheduler can take more >>> and re-queue the worker to queue more. >

Re: [PATCH v6 14/15] drm/sched: Queue all free credits in one worker invocation

2025-07-08 Thread Christian König
-off-by: Tvrtko Ursulin > Cc: Christian König > Cc: Danilo Krummrich > Cc: Matthew Brost > Cc: Philipp Stanner > --- > drivers/gpu/drm/scheduler/sched_internal.h | 2 - > drivers/gpu/drm/scheduler/sched_main.c | 132 ++--- > drivers/gpu/drm/schedul

Re: [PATCH] drm/amdgpu: fix MQD debugfs undefined symbol when DEBUG_FS=n

2025-07-08 Thread Christian König
On 08.07.25 12:15, Sunil Khatri wrote: > Fix undefined reference to amdgpu_mqd_info_fops during > debugfs_create_file if DEBUG_FS=n > > Signed-off-by: Sunil Khatri Yeah, that's exactly the reason why I wanted to put this into amdgpu_debugfs.c. For now Reviewed-by: Christi

Re: [PATCH] dma-buf: Take a breath during dma-fence-chain subtests

2025-07-08 Thread Christian König
On 08.07.25 10:56, Janusz Krzysztofik wrote: >> >> There is no reason to test enabling signaling on each of the elements in a loop. >> So there should be something like 4096 calls to the dma_fence_chain_cb >> function each jumping to the next unsignaled fence and re-installing the >> callback. > >

Re: [PATCH v3 2/5] drm/amdgpu: move GTT to shmem after eviction for hibernation

2025-07-08 Thread Christian König
KMD, then shmem to swap disk in kernel > hibernation code to make room for hibernation image. > > Signed-off-by: Samuel Zhang Reviewed-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +- > 1 file changed, 9 insertions(+), 1 deletion(-) >

Re: [PATCH v3 1/5] drm/ttm: add new api ttm_device_prepare_hibernation()

2025-07-08 Thread Christian König
On 08.07.25 09:42, Samuel Zhang wrote: > This new api is used for hibernation to move GTT BOs to shmem after > VRAM eviction. shmem will be flushed to swap disk later to reduce > the system memory usage for hibernation. > > Signed-off-by: Samuel Zhang > --- > drivers/gpu/drm/ttm/ttm_device.c

Re: [PATCH v3 2/3] drm/amdgpu: Reset the clear flag in buddy during resume

2025-07-08 Thread Christian König
ng able to flip the state either way would be > good. (Matthew Brost) > > v3(Matthew Auld): > - Do merge step first to avoid the use of extra reset flag. > > Signed-off-by: Arunpravin Paneer Selvam > Suggested-by: Christian König > Cc: sta...@vger.kernel.org > Fi

Re: [PATCH v3 1/3] drm/amdgpu: Add WARN_ON to the resource clear function

2025-07-08 Thread Christian König
> - Add back the resource clear flag set function call after > being wiped during eviction (Christian). > - Modified the patch subject name. > > Signed-off-by: Arunpravin Paneer Selvam > Suggested-by: Christian König > Cc: sta...@vger.kernel.org > Fixes: a68c7eaa7a

Re: [PATCH v2 1/1] drm/amdkfd: return -ENOTTY for unsupported IOCTLs

2025-07-08 Thread Christian König
On 08.07.25 06:22, Geoffrey McRae wrote: > Some kfd ioctls may not be available depending on the kernel version the > user is running, as such we need to report -ENOTTY so userland can > determine the cause of the ioctl failure. In general sounds like a good idea, but ENOTTY is potentially a bit m
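
As a rough illustration of why the error code matters to userland, here is a minimal sketch of how a caller might classify a failed ioctl by its errno value. The enum, the function name, and the reading of EINVAL as "request known, arguments rejected" are assumptions made for this example; none of them come from the kfd code or from libdrm:

```c
#include <errno.h>

/* Hypothetical classification used only in this sketch. */
enum demo_ioctl_result {
	DEMO_IOCTL_OK,          /* call succeeded */
	DEMO_IOCTL_UNSUPPORTED, /* kernel/driver does not know the request */
	DEMO_IOCTL_BAD_ARGS,    /* request known, arguments rejected */
	DEMO_IOCTL_OTHER        /* any other failure */
};

/*
 * Map an errno value captured right after a failed ioctl() to a coarse
 * reason. ENOTTY is the conventional "inappropriate ioctl" code, which
 * lets the caller tell "this kernel lacks the ioctl" apart from other
 * failures.
 */
static enum demo_ioctl_result demo_classify_ioctl_errno(int err)
{
	switch (err) {
	case 0:
		return DEMO_IOCTL_OK;
	case ENOTTY:
		return DEMO_IOCTL_UNSUPPORTED;
	case EINVAL:
		return DEMO_IOCTL_BAD_ARGS;
	default:
		return DEMO_IOCTL_OTHER;
	}
}
```

With a mapping like this, userland could fall back to an older code path on DEMO_IOCTL_UNSUPPORTED instead of treating the failure as fatal.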

Re: [PATCH 1/2] drm/ttm: rename ttm_bo_put to _fini

2025-07-07 Thread Christian König
On 07.07.25 18:25, Matthew Brost wrote: > On Mon, Jul 07, 2025 at 02:38:07PM +0200, Christian König wrote: >> On 03.07.25 00:01, Matthew Brost wrote: >>>> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c >>>> b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c >

Re: [PATCH v3] drm/framebuffer: Acquire internal references on GEM handles

2025-07-07 Thread Christian König
sted-by: Bert Karwatzki > Tested-by: Mario Limonciello > Cc: Thomas Zimmermann > Cc: Anusha Srivatsa > Cc: Christian König > Cc: Maarten Lankhorst > Cc: Maxime Ripard > Cc: Sumit Semwal > Cc: "Christian König" > Cc: linux-me...@vger.kernel.org > Cc: dri

Re: [PATCH] dma-buf: Take a breath during dma-fence-chain subtests

2025-07-07 Thread Christian König
On 07.07.25 14:25, Janusz Krzysztofik wrote: > Hi Christian, > > I've taken over that old issue and have a few questions for you. Thanks a lot for that, something really fishy seems to be going on here. > On Thursday, 27 February 2025 15:11:39 CEST Christian König wrote: ...

Re: [PATCH 1/2] drm/ttm: rename ttm_bo_put to _fini

2025-07-07 Thread Christian König
On 03.07.25 00:01, Matthew Brost wrote: >> diff --git a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c >> b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c >> index 6c77550c51af..5426b435f702 100644 >> --- a/drivers/gpu/drm/ttm/tests/ttm_bo_test.c >> +++ b/drivers/gpu/drm/ttm/tests/ttm_bo_test.c >> @@ -379,7 +37

Re: Switching TTM over to GEM refcounts v2

2025-07-07 Thread Christian König
On 02.07.25 18:15, Matthew Brost wrote: > On Wed, Jul 02, 2025 at 01:00:26PM +0200, Christian König wrote: >> Hi everyone, >> >> v2 of this patch set. I've either pushed or removed the other >> patches from v1, so only two remain. >> >> Pretty straight

Re: WARNING: drivers/gpu/drm/drm_gem.c:286 at drm_gem_object_handle_put_unlocked+0xb1/0xf0 [drm]

2025-07-07 Thread Christian König
On 07.07.25 11:30, Borislav Petkov wrote: > Hi all, > > I see the below on -rc5 + tip, on a RN machine. Yeah, that's a known issue. Thomas and I are working on that. Regards, Christian. > > --- > > [5.592468] cdc_ncm 2-2:2.0 eth0: register 'cdc_ncm' at > usb-:03:00.3-2, CDC NCM (NO

Re: [PATCH v2] drm/framebuffer: Acquire internal references on GEM handles

2025-07-07 Thread Christian König
Fixes: 5307dce878d4 ("drm/gem: Acquire references on GEM handles for > framebuffers") > Reported-by: Bert Karwatzki > Closes: > https://lore.kernel.org/dri-devel/20250703115915.3096-1-spassw...@web.de/ > Tested-by: Bert Karwatzki > Tested-by: Mario Limonciello > Cc: Thomas

Re: [PATCH v2 2/5] drm/amdgpu: move GTT to shmem after eviction for hibernation

2025-07-07 Thread Christian König
On 04.07.25 12:12, Samuel Zhang wrote: > When hibernating with data center dGPUs, a huge number of VRAM BOs are evicted > to GTT and take too much system memory. This will cause hibernation > to fail due to insufficient memory for creating the hibernation image. > > Move GTT BOs to shmem in KMD, then shm

Re: [PATCH v2 1/5] drm/ttm: add ttm_device_prepare_hibernation() api

2025-07-07 Thread Christian König
On 04.07.25 12:12, Samuel Zhang wrote: > This new api is used for hibernation to move GTT BOs to shmem after > VRAM eviction. shmem will be flushed to swap disk later to reduce > the system memory usage for hibernation. > > Signed-off-by: Samuel Zhang > --- > drivers/gpu/drm/ttm/ttm_device.c | 2

Re: [PATCH] drm/framebuffer: Acquire internal references on GEM handles

2025-07-04 Thread Christian König
On 04.07.25 14:31, Thomas Zimmermann wrote: > Hi > > On 04.07.25 at 14:06, Christian König wrote: >> On 04.07.25 10:53, Thomas Zimmermann wrote: >>> Acquire GEM handles in drm_framebuffer_init() and release them in >>> the corresponding drm_framebuffer_cleanup(

Re: [PATCH] drm/framebuffer: Acquire internal references on GEM handles

2025-07-04 Thread Christian König
igned-off-by: Thomas Zimmermann > Fixes: 5307dce878d4 ("drm/gem: Acquire references on GEM handles for > framebuffers") > Reported-by: Bert Karwatzki > Closes: > https://lore.kernel.org/dri-devel/20250703115915.3096-1-spassw...@web.de/ > Tested-by: Bert Karwatzk

Re: [PATCH 17/17] amdgpu: add support for memory cgroups

2025-07-04 Thread Christian König
On 03.07.25 23:22, David Airlie wrote: >> >> Do you mean a task in cgroup A does amdgpu_gem_object_create() and then >> the actual allocation can happen in the task in cgroup B? > > On android and in some graphics scenarios, this might happen, not sure > if it does always though. We have scenarios

Re: [PATCH] drm/amdgpu: Fix lifetime of struct amdgpu_task_info after ring reset

2025-07-04 Thread Christian König
rg/dri-devel/CAPM=9tz0rQP8VZWKWyuF8kUMqRScxqoa6aVdwWw9=5yyxyy...@mail.gmail.com/ > Signed-off-by: André Almeida Reviewed-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/

Re: [PATCH 17/17] amdgpu: add support for memory cgroups

2025-07-03 Thread Christian König
On 03.07.25 19:58, Shakeel Butt wrote: > On Thu, Jul 03, 2025 at 12:53:44PM +1000, David Airlie wrote: >> On Thu, Jul 3, 2025 at 2:03 AM Shakeel Butt wrote: >>> >>> On Mon, Jun 30, 2025 at 02:49:36PM +1000, Dave Airlie wrote: From: Dave Airlie This adds support for adding a obj cgr

Re: [RFC 00/12] io_uring dmabuf read/write support

2025-07-03 Thread Christian König
On 03.07.25 16:23, Christoph Hellwig wrote: > [Note: it would be really useful to Cc all relevant maintainers] > > On Fri, Jun 27, 2025 at 04:10:27PM +0100, Pavel Begunkov wrote: >> This series implements it for read/write io_uring requests. The uAPI >> looks similar to normal registered buffers,

Re: Warnings in next-20250703 caused by commit 582111e630f5

2025-07-03 Thread Christian König
On 03.07.25 15:54, Thomas Zimmermann wrote: > Hi > > On 03.07.25 at 15:45, Christian König wrote: >> On 03.07.25 15:37, Thomas Zimmermann wrote: >>> Hi >>> >>> On 03.07.25 at 13:59, Bert Karwatzki wrote: >>>> When booting next-20250703 on

Re: Warnings in next-20250703 caused by commit 582111e630f5

2025-07-03 Thread Christian König
On 03.07.25 15:37, Thomas Zimmermann wrote: > Hi > > On 03.07.25 at 13:59, Bert Karwatzki wrote: >> When booting next-20250703 on my Msi Alpha 15 Laptop running debian sid (last >> updated 20250703) I get several warnings of the following kind: >> >> [    8.702999] [   T1628] [

Re: [PATCH v1 1/3] drm/buddy: add a flag to disable trimming of non cleared blocks

2025-07-03 Thread Christian König
On 02.07.25 18:12, Pierre-Eric Pelloux-Prayer wrote: > A vkcts test case is triggering a case where the drm buddy allocator > wastes lots of memory and performs badly: > > dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000 > > For each memory pool type, the test will allocate 4000 8kB

Re: [PATCH v9 3/4] drm/amdgpu: add debugfs support for VM pagetable per client

2025-07-03 Thread Christian König
0 for success, error for failure. > */ > int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, > -int32_t xcp_id) > +int32_t xcp_id, struct drm_file *file) > { > struct amdgpu_bo *root_bo; > struct amdgpu_bo_vm *ro

Re: [PATCH v2 1/3] drm/amdgpu: Dirty cleared blocks on free

2025-07-02 Thread Christian König
On 02.07.25 13:58, Arunpravin Paneer Selvam wrote: > Hi Christian, > > On 7/2/2025 1:27 PM, Christian König wrote: >> On 01.07.25 21:08, Arunpravin Paneer Selvam wrote: >>> Set the dirty bit when the memory resource is not cleared >>> during BO release. >>

[PATCH 2/2] drm/ttm: replace TTMs refcount with the DRM refcount v2

2025-07-02 Thread Christian König
re-enable disabled test Signed-off-by: Christian König --- .../gpu/drm/ttm/tests/ttm_bo_validate_test.c | 8 +- drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 2 - drivers/gpu/drm/ttm/ttm_bo.c | 148 +- drivers/gpu/drm/ttm/ttm_bo_internal.h | 9

[PATCH 1/2] drm/ttm: rename ttm_bo_put to _fini

2025-07-02 Thread Christian König
Give TTM BOs a separate cleanup function. The next step in removing the TTM BO reference counting and replacing it with the GEM object reference counting. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +- drivers/gpu/drm/drm_gem_vram_helper.c

Switching TTM over to GEM refcounts v2

2025-07-02 Thread Christian König
Hi everyone, v2 of this patch set. I've either pushed or removed the other patches from v1, so only two remain. Pretty straightforward conversion and shouldn't result in any visible technical difference. Please review and/or comment. Regards, Christian.

Re: [PATCH][next] drm/ttm: remove redundant ternaray operation on ret

2025-07-02 Thread Christian König
On 02.07.25 11:43, Colin King (gmail) wrote: > On 02/07/2025 10:42, Robert P. J. Day wrote: >> >>    subject has typo, should be "ternary" >> >> rday > > Good catch. Can that be fixed up before applying the patch rather than me > sending a V2? Thomas or I can take care of that before pushing.

Re: [PATCH v2 1/3] drm/amdgpu: Dirty cleared blocks on free

2025-07-02 Thread Christian König
ff-by: Arunpravin Paneer Selvam > Suggested-by: Christian König > Cc: sta...@vger.kernel.org > Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 - > drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mg

Re: [PATCH 12/17] ttm: add objcg pointer to bo and tt

2025-07-02 Thread Christian König
On 02.07.25 09:57, David Airlie wrote: >>> >>> It makes it easier now, but when we have to solve swapping, step one >>> will be moving all this code around to what I have now, and starting >>> from there. >>> >>> This just raises the bar to solving the next problem. >>> >>> We need to find incremen

Re: [PATCH 1/3] drm/amdgpu: move GTT to SHM after eviction for hibernation

2025-07-02 Thread Christian König
On 02.07.25 09:28, Samuel Zhang wrote: > > On 2025/7/1 16:22, Christian König wrote: >> On 01.07.25 10:18, Zhang, GuoQing (Sam) wrote: >>> [AMD Official Use Only - AMD Internal Distribution Only] >>> >>> >>> Hi Christian, >>> >>>  

Re: [PATCH 12/17] ttm: add objcg pointer to bo and tt

2025-07-02 Thread Christian König
On 02.07.25 00:11, David Airlie wrote: > On Tue, Jul 1, 2025 at 6:16 PM Christian König > wrote: >> >> On 01.07.25 10:06, David Airlie wrote: >>> On Tue, Jul 1, 2025 at 5:22 PM Christian König >>> wrote: >>>>>>> diff --git a/include/

Re: [PATCH] drm/gem-shmem: Do not map s/g table by default

2025-07-02 Thread Christian König
> Signed-off-by: Thomas Zimmermann > Reported-by: Zenghui Yu > Closes: > https://lore.kernel.org/dri-devel/6d22bce3-4533-4cfa-96ba-64352b715...@linux.dev/ > # [1] > Reported-by: José Expósito > Closes: > https://lore.kernel.org/dri-devel/20250311172054.2903-1-jose.exposit...@gmail.com

Re: [PATCH v2 1/5] drm: Add a firmware flash method to device wedged uevent

2025-07-01 Thread Christian König
On 01.07.25 16:23, Raag Jadav wrote: > On Tue, Jul 01, 2025 at 05:11:24PM +0530, Riana Tauro wrote: >> On 7/1/2025 5:07 PM, Riana Tauro wrote: >>> On 6/30/2025 11:03 PM, Rodrigo Vivi wrote: >>>> On Mon, Jun 30, 2025 at 10:29:10AM +0200, Christian König wrote: >>

Re: [PATCH 3/3] drm/amdgpu: skip kfd resume_process for dev_pm_ops.thaw()

2025-07-01 Thread Christian König
at approach here looks fishy to me, but I don't know how to properly fix it either. @Alex any idea? Regards, Christian. > > > Regards > Sam > > > On 2025/6/30 19:58, Christian König wrote: >> On 30.06.25 12:41, Samuel Zhang wrote: >>> The hibernation su

Re: [PATCH 1/3] drm/amdgpu: move GTT to SHM after eviction for hibernation

2025-07-01 Thread Christian König
On 01.07.25 10:18, Zhang, GuoQing (Sam) wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > > > Hi Christian, > >   > > Thank you for the feedback. > >   > > For “return ret < 0 ? ret : 0;”, it is equivalent to “return ret;” since ret > is always <= 0 after the loop. No it i

Re: [PATCH 12/17] ttm: add objcg pointer to bo and tt

2025-07-01 Thread Christian König
On 01.07.25 10:06, David Airlie wrote: > On Tue, Jul 1, 2025 at 5:22 PM Christian König > wrote: >>>>> diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h >>>>> index 15d4019685f6..c13fea4c2915 100644 >>>>> --- a/include/dr

Re: [PATCH] drm/sched: Increment job count before swapping tail spsc queue

2025-07-01 Thread Christian König
"drm: move amd_gpu_scheduler into common location") > Fixes: 27105db6c63a ("drm/amdgpu: Add SPSC queue to scheduler.") > Cc: sta...@vger.kernel.org > Signed-off-by: Matthew Brost Sorry for the late response, if it isn't already pushed to drm-misc-fixes then feel free to

Re: [PATCH 12/17] ttm: add objcg pointer to bo and tt

2025-07-01 Thread Christian König
On 30.06.25 23:33, David Airlie wrote: > On Mon, Jun 30, 2025 at 8:24 PM Christian König > wrote: >> >> On 30.06.25 06:49, Dave Airlie wrote: >>> From: Dave Airlie >>> >>> This just adds the obj cgroup pointer to the bo and tt structs, >>>

Re: [PATCH v7 1/5] drm: move the debugfs accel driver code to drm layer

2025-06-30 Thread Christian König
On 30.06.25 16:36, Sunil Khatri wrote: > Move the debugfs accel driver code to the drm layer > and it is an intermediate step to move all debugfs > related handling into drm_debugfs.c > > Signed-off-by: Sunil Khatri > Reviewed-by: Christian König > --- > driver

Re: [PATCH] drm/ttm: Remove unneeded blank line in comment

2025-06-30 Thread Christian König
: 718370ff28328 ("drm/ttm: Add ttm_bo_kmap_try_from_panic()") > Signed-off-by: Jocelyn Falempe Reviewed-by: Christian König > --- > > Can this be merged through the drm-intel-next, as this is where the > offending commit was merged. > > drivers/gpu/drm/t

Re: [PATCH v6 2/5] drm: move debugfs functionality from drm_drv.c to drm_debugfs.c

2025-06-30 Thread Christian König
On 30.06.25 15:34, Khatri, Sunil wrote: > > On 6/30/2025 5:11 PM, Christian König wrote: >> >> On 27.06.25 11:49, Sunil Khatri wrote: >>> move the debugfs functions from drm_drv.c to drm_debugfs.c >>> >>> move this root node to the debugfs for easi

Re: [PATCH v6 5/5] drm/amdgpu: add support of debugfs for mqd information

2025-06-30 Thread Christian König
On 27.06.25 11:49, Sunil Khatri wrote: > Add debugfs support for mqd for each queue of the client. > > The address exposed to debugfs could be used to dump > the mqd. > > Signed-off-by: Sunil Khatri > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 52 +++ > drivers/gpu/dr

Re: [PATCH v6 4/5] drm/amdgpu: add debugfs support for VM pagetable per client

2025-06-30 Thread Christian König
On 27.06.25 11:49, Sunil Khatri wrote: > Add a debugfs file under the client directory which shares > the root page table base address of the VM. > > This address could be used to dump the pagetable for debug > memory issues. > > Signed-off-by: Sunil Khatri > --- > drivers/gpu/drm/amd/amdgpu

Re: [PATCH v6 3/5] drm: add debugfs support on per client-id basis

2025-06-30 Thread Christian König
> Also create a debugfs file which shows the process > information for the client and create a symlink back > to the parent drm device from each client. > > Signed-off-by: Sunil Khatri Reviewed-by: Christian König > --- > drivers/gpu/drm/drm_debugfs.c | 80 +++

Re: [PATCH 3/3] drm/amdgpu: skip kfd resume_process for dev_pm_ops.thaw()

2025-06-30 Thread Christian König
On 30.06.25 12:41, Samuel Zhang wrote: > The hibernation successful workflow: > - prepare: evict VRAM and swapout GTT BOs > - freeze > - create the hibernation image in system memory > - thaw: swapin and restore BOs Why should a thaw happen here in between? > - complete > - write hibernation imag

Re: [PATCH 1/3] drm/amdgpu: move GTT to SHM after eviction for hibernation

2025-06-30 Thread Christian König
On 30.06.25 12:41, Samuel Zhang wrote: > When hibernating with data center dGPUs, a huge number of VRAM BOs are evicted > to GTT and take too much system memory. This will cause hibernation > to fail due to insufficient memory for creating the hibernation image. > > Move GTT BOs to shmem in KMD, then shmem
