Re: [PATCH v2] drm/amdkfd: Move the process suspend and resume out of full access

2025-06-03 Thread Lazar, Lijo
On 5/27/2025 4:19 PM, Emily Deng wrote: > For the suspend and resume process, exclusive access is not required. > Therefore, it can be moved out of the full access section to reduce the > duration of exclusive access. > > Signed-off-by: Emily Deng > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amd

RE: [PATCH] drm/amdgpu: Clear reset flags from ras context

2025-06-03 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -----Original Message----- From: Lazar, Lijo Sent: Wednesday, June 4, 2025 12:10 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Zhou1, Tao Subject: [PATCH] drm/amd

[PATCH] drm/amdgpu: Clear reset flags from ras context

2025-06-03 Thread Lijo Lazar
Once RAS errors are cleared with appropriate recovery mechanism, clear reset flags also from RAS context. Otherwise, stale flag values could affect the subsequent RAS reset handling on the device. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++- 1 file changed, 3 i

RE: [PATCH v2] drm/amdkfd: Move the process suspend and resume out of full access

2025-06-03 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Ping.. Emily Deng Best Wishes >-----Original Message----- >From: Deng, Emily >Sent: Tuesday, June 3, 2025 5:11 PM >To: Koenig, Christian ; Chen, Horace > >Cc: amd-gfx@lists.freedesktop.org; Zhang, Owen(SRDC) > >Subject: RE: [PATCH v2

RE: [PATCH v3] drm/ttm: Should to return the evict error

2025-06-03 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Hi Christian, Could you help to review again? Emily Deng Best Wishes >-----Original Message----- >From: Emily Deng >Sent: Tuesday, June 3, 2025 5:12 PM >To: amd-gfx@lists.freedesktop.org >Cc: Deng, Emily >Subject: [PATCH v3] drm/t

RE: [PATCH] drm/amdkfd: fix mes based process evictions

2025-06-03 Thread Kim, Jonathan
[Public] > -----Original Message----- > From: Kasiviswanathan, Harish > Sent: Tuesday, June 3, 2025 5:22 PM > To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Joshi, Mukul > ; Kim, Jonathan > Subject: RE: [PATCH] drm/amdkfd: fix mes based process evictions > > [Public] >

RE: [PATCH] drm/amdkfd: fix mes based process evictions

2025-06-03 Thread Kasiviswanathan, Harish
[Public] So, the code now loops two times over the list of queues. Couple of questions. (1) Isn't it possible to call suspend_all_queues_mes(dqm) before the first list_for_each_entry()? The first loop only does some housekeeping. That way you can still get away with a single loop. (2) Also rem

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Christian König
On 6/3/25 17:00, Tvrtko Ursulin wrote: > > On 03/06/2025 14:13, Maxime Ripard wrote: >> Hi, >> >> On Mon, Jun 02, 2025 at 04:42:27PM +0200, Christian König wrote: >>> On 6/2/25 15:05, Tvrtko Ursulin wrote: On 15/05/2025 14:15, Christian König wrote: > Hey drm-misc maintainers, > >>

Re: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

2025-06-03 Thread Harry Wentland
On 2025-06-03 06:51, Pekka Paalanen wrote: > On Tue, 3 Jun 2025 08:30:23 + > "Shankar, Uma" wrote: > >>> -----Original Message----- >>> From: Pekka Paalanen >>> Sent: Friday, May 30, 2025 7:28 PM >>> To: Shankar, Uma >>> Cc: Simon Ser ; Harry Wentland >>> ; Alex Hung ; dri- >>> de...@lis

RE: [PATCH] drm/amdkfd: Allow device error to be logged

2025-06-03 Thread Kasiviswanathan, Harish
[Public] -----Original Message----- From: Clement, Sunday Sent: Friday, May 30, 2025 11:17 AM To: amd-gfx@lists.freedesktop.org Cc: Tudor, Alexandru ; Kasiviswanathan, Harish ; Clement, Sunday Subject: [PATCH] drm/amdkfd: Allow device error to be logged The addition of a WARN_ON() check in ord

RE: [PATCH] drm/amdkfd: Move SDMA queue reset capability check to node_show

2025-06-03 Thread Kim, Jonathan
[Public] > -----Original Message----- > From: Jesse.Zhang > Sent: Monday, June 2, 2025 11:09 PM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Koenig, Christian > ; Kuehling, Felix ; Kim, > Jonathan ; Zhang, Jesse(Jie) > Subject: [PATCH] drm/amdkfd: Move SDMA queue reset capabili

Re: [RFC PATCH 2/2] drm/amdgpu/uvd: Ensure vcpu bos are within the uvd segment

2025-06-03 Thread John Olender
On 6/3/25 12:26 PM, Christian König wrote: > > > On 6/3/25 16:34, John Olender wrote: Oh, that's a very interesting find. Could you try to turn around the way the patch works? E.g. instead of forcing the UVD FW into the first segment, change amdgpu_uvd_force_into_uvd_se

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Tvrtko Ursulin
On 03/06/2025 17:27, Christian König wrote: On 6/3/25 17:00, Tvrtko Ursulin wrote: On 03/06/2025 14:13, Maxime Ripard wrote: Hi, On Mon, Jun 02, 2025 at 04:42:27PM +0200, Christian König wrote: On 6/2/25 15:05, Tvrtko Ursulin wrote: On 15/05/2025 14:15, Christian König wrote: Hey drm-mis

Re: [PATCH 07/28] drm/amdgpu: track ring state associated with a job

2025-06-03 Thread Alex Deucher
On Tue, Jun 3, 2025 at 11:18 AM Alex Deucher wrote: > > On Tue, Jun 3, 2025 at 11:01 AM Christian König > wrote: > > > > On 6/3/25 16:27, Alex Deucher wrote: > > >>> I'm not quite sure I understand what you are proposing. Is the idea > > >>> to track all of the jobs associated with a particular

[PATCH] drm/amdkfd: fix mes based process evictions

2025-06-03 Thread Jonathan Kim
First, MES suspend/resume calls should be consistently held under the KFD device lock (similar to queue invalidation) so consolidate MES based eviction logic with F32 HWS based eviction logic. Second, save the last eviction timestamp prior to current eviction to align with F32 HWS timestamp loggin

Re: [RFC PATCH 2/2] drm/amdgpu/uvd: Ensure vcpu bos are within the uvd segment

2025-06-03 Thread Christian König
On 6/3/25 16:34, John Olender wrote: >>> Oh, that's a very interesting find. Could you try to turn around the way >>> the patch works? >>> >>> E.g. instead of forcing the UVD FW into the first segment, change >>> amdgpu_uvd_force_into_uvd_segment() so that the BOs are forced into the >>> same

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Christian König
On 6/3/25 15:13, Maxime Ripard wrote: > Hi, > > On Mon, Jun 02, 2025 at 04:42:27PM +0200, Christian König wrote: >> On 6/2/25 15:05, Tvrtko Ursulin wrote: >>> On 15/05/2025 14:15, Christian König wrote: Hey drm-misc maintainers, can you guys please backmerge drm-next into drm-misc-n

Re: [PATCH 07/28] drm/amdgpu: track ring state associated with a job

2025-06-03 Thread Alex Deucher
On Tue, Jun 3, 2025 at 11:01 AM Christian König wrote: > > On 6/3/25 16:27, Alex Deucher wrote: > >>> I'm not quite sure I understand what you are proposing. Is the idea > >>> to track all of the jobs associated with a particular process and then > >>> when we reset a queue, skip all of the ring

Re: [PATCH 07/28] drm/amdgpu: track ring state associated with a job

2025-06-03 Thread Christian König
On 6/3/25 16:27, Alex Deucher wrote: >>> I'm not quite sure I understand what you are proposing. Is the idea >>> to track all of the jobs associated with a particular process and then >>> when we reset a queue, skip all of the ring contents associated with >>> those jobs and signal and set the err

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Tvrtko Ursulin
On 03/06/2025 14:13, Maxime Ripard wrote: Hi, On Mon, Jun 02, 2025 at 04:42:27PM +0200, Christian König wrote: On 6/2/25 15:05, Tvrtko Ursulin wrote: On 15/05/2025 14:15, Christian König wrote: Hey drm-misc maintainers, can you guys please backmerge drm-next into drm-misc-next? I want to

RE: [PATCH V2] drm/amdkfd: add late initialization support for amdkfd device

2025-06-03 Thread Kim, Jonathan
[Public] > -----Original Message----- > From: Zhang, Jesse(Jie) > Sent: Monday, June 2, 2025 11:05 PM > To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Kuehling, Felix > > Subject: RE: [PATCH V2] drm/amdkfd: add late initialization support for amdkfd > device > > [Pu

Re: [RFC PATCH 2/2] drm/amdgpu/uvd: Ensure vcpu bos are within the uvd segment

2025-06-03 Thread John Olender
>> Oh, that's a very interesting find. Could you try to turn around the way the >> patch works? >> >> E.g. instead of forcing the UVD FW into the first segment, change >> amdgpu_uvd_force_into_uvd_segment() so that the BOs are forced into the same >> segment as the UVD firmware. >> I started im

Re: [PATCH 07/28] drm/amdgpu: track ring state associated with a job

2025-06-03 Thread Alex Deucher
On Tue, Jun 3, 2025 at 4:03 AM Christian König wrote: > > On 6/3/25 00:42, Alex Deucher wrote: > > On Mon, Jun 2, 2025 at 10:36 AM Christian König > > wrote: > >> > >> On 5/29/25 22:07, Alex Deucher wrote: > >>> We need to know the wptr and sequence number associated > >>> with a job so that we c

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Christian König
On 6/3/25 14:48, Tvrtko Ursulin wrote: > > On 03/06/2025 13:40, Christian König wrote: >> On 6/3/25 13:30, Tvrtko Ursulin wrote: >>> >>> On 02/06/2025 19:00, Christian König wrote: On 6/2/25 17:25, Tvrtko Ursulin wrote: > > On 02/06/2025 15:42, Christian König wrote: >> On 6/2/25

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 3:36 PM Chris Bainbridge wrote: > > On Tue, Jun 03, 2025 at 03:04:33PM +0200, Rafael J. Wysocki wrote: > > On Tue, Jun 3, 2025 at 2:27 PM Chris Bainbridge > > wrote: > > > > > > On Tue, 3 Jun 2025 at 13:24, Rafael J. Wysocki wrote: > > > > > > > > > > This patch does fix t

Re: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

2025-06-03 Thread Pekka Paalanen
On Tue, 3 Jun 2025 08:30:23 + "Shankar, Uma" wrote: > > -----Original Message----- > > From: Pekka Paalanen > > Sent: Friday, May 30, 2025 7:28 PM > > To: Shankar, Uma > > Cc: Simon Ser ; Harry Wentland > > ; Alex Hung ; dri- > > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; i

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Chris Bainbridge
On Tue, 3 Jun 2025 at 13:24, Rafael J. Wysocki wrote: > > > > This patch does fix the list corruption, but the "Unbalanced > > pm_runtime_enable" still occurs: > > Have you applied it together with the previous patch? Yes

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Chris Bainbridge
On Tue, Jun 03, 2025 at 11:38:37AM +0200, Rafael J. Wysocki wrote: > > Chris, please check if the attached patch helps. I'm going to post it > as a fix anyway later today, but it would be good to verify that it is > sufficient. This did not fix my test case, pstore crash log was: <6>[ 100.6902

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Maxime Ripard
Hi, On Mon, Jun 02, 2025 at 04:42:27PM +0200, Christian König wrote: > On 6/2/25 15:05, Tvrtko Ursulin wrote: > > On 15/05/2025 14:15, Christian König wrote: > >> Hey drm-misc maintainers, > >> > >> can you guys please backmerge drm-next into drm-misc-next? > >> > >> I want to push this patch here

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 2:27 PM Chris Bainbridge wrote: > > On Tue, 3 Jun 2025 at 13:24, Rafael J. Wysocki wrote: > > > > > > This patch does fix the list corruption, but the "Unbalanced > > > pm_runtime_enable" still occurs: > > > > Have you applied it together with the previous patch? > > Yes S

RE: [PATCH] drm/amdgpu: Add more checks to PSP mailbox

2025-06-03 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] RE - log error if waiting if waiting on a psp response fails/times out Please remove the redundant repetition of 'if waiting' from the commit description. The patch is Reviewed-by: Hawking Zhang Regards, Hawking -----Original Message----- From

Re: [RFC 20/23] cgroup/drm: Introduce weight based scheduling control

2025-06-03 Thread Michal Koutný
Hello. On Fri, May 02, 2025 at 01:32:53PM +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Similar to CPU and IO scheduling, implement a concept of weight in the DRM > cgroup controller. > > Individual drivers are now able to register with the controller which will > notify them of the

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Tvrtko Ursulin
On 03/06/2025 13:40, Christian König wrote: On 6/3/25 13:30, Tvrtko Ursulin wrote: On 02/06/2025 19:00, Christian König wrote: On 6/2/25 17:25, Tvrtko Ursulin wrote: On 02/06/2025 15:42, Christian König wrote: On 6/2/25 15:05, Tvrtko Ursulin wrote: Hi, On 15/05/2025 14:15, Christian Kö

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Christian König
On 6/3/25 13:30, Tvrtko Ursulin wrote: > > On 02/06/2025 19:00, Christian König wrote: >> On 6/2/25 17:25, Tvrtko Ursulin wrote: >>> >>> On 02/06/2025 15:42, Christian König wrote: On 6/2/25 15:05, Tvrtko Ursulin wrote: > > Hi, > > On 15/05/2025 14:15, Christian König wrote: >

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 2:15 PM Chris Bainbridge wrote: > > On Tue, Jun 03, 2025 at 01:39:01PM +0200, Rafael J. Wysocki wrote: > > On Tue, Jun 3, 2025 at 1:37 PM Rafael J. Wysocki wrote: > > > > > > On Tue, Jun 3, 2025 at 12:30 PM Rafael J. Wysocki > > > wrote: > > > > > > > > On Tue, Jun 3, 202

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 1:37 PM Rafael J. Wysocki wrote: > > On Tue, Jun 3, 2025 at 12:30 PM Rafael J. Wysocki wrote: > > > > On Tue, Jun 3, 2025 at 12:29 PM Rafael J. Wysocki wrote: > > > > > > On Tue, Jun 3, 2025 at 12:17 PM Chris Bainbridge > > > wrote: > > > > > > > > On Tue, Jun 03, 2025 at

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 12:30 PM Rafael J. Wysocki wrote: > > On Tue, Jun 3, 2025 at 12:29 PM Rafael J. Wysocki wrote: > > > > On Tue, Jun 3, 2025 at 12:17 PM Chris Bainbridge > > wrote: > > > > > > On Tue, Jun 03, 2025 at 11:38:37AM +0200, Rafael J. Wysocki wrote: > > > > > > > > Chris, please c

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-06-03 Thread Simona Vetter
On Wed, May 28, 2025 at 04:47:00PM +0200, Danilo Krummrich wrote: > On Wed, May 28, 2025 at 04:39:01PM +0200, Danilo Krummrich wrote: > > On Wed, May 28, 2025 at 09:29:30AM -0400, Alex Deucher wrote: > > > On Wed, May 28, 2025 at 8:45 AM Simona Vetter > > > wrote: > > > > I do occasionally find i

Re: [PATCH v4 2/9] dma-fence: Use a flag for 64-bit seqnos

2025-06-03 Thread Tvrtko Ursulin
On 02/06/2025 19:00, Christian König wrote: On 6/2/25 17:25, Tvrtko Ursulin wrote: On 02/06/2025 15:42, Christian König wrote: On 6/2/25 15:05, Tvrtko Ursulin wrote: Hi, On 15/05/2025 14:15, Christian König wrote: Hey drm-misc maintainers, can you guys please backmerge drm-next into drm

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 12:29 PM Rafael J. Wysocki wrote: > > On Tue, Jun 3, 2025 at 12:17 PM Chris Bainbridge > wrote: > > > > On Tue, Jun 03, 2025 at 11:38:37AM +0200, Rafael J. Wysocki wrote: > > > > > > Chris, please check if the attached patch helps. I'm going to post it > > > as a fix anywa

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Tue, Jun 3, 2025 at 12:17 PM Chris Bainbridge wrote: > > On Tue, Jun 03, 2025 at 11:38:37AM +0200, Rafael J. Wysocki wrote: > > > > Chris, please check if the attached patch helps. I'm going to post it > > as a fix anyway later today, but it would be good to verify that it is > > sufficient. >

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Rafael J. Wysocki
On Mon, Jun 2, 2025 at 9:58 PM Rafael J. Wysocki wrote: > > On Mon, Jun 2, 2025 at 5:22 PM Mario Limonciello wrote: > > > > On 6/2/2025 9:29 AM, Rafael J. Wysocki wrote: > > > On Mon, Jun 2, 2025 at 2:11 PM Chris Bainbridge > > > wrote: > > >> > > >> On Fri, Mar 14, 2025 at 02:13:53PM +0100, Raf

RE: [PATCH v3] drm/ttm: Should to return the evict error

2025-06-03 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Sorry, sent the wrong patch, please ignore this. Emily Deng Best Wishes >-----Original Message----- >From: Emily Deng >Sent: Tuesday, June 3, 2025 5:10 PM >To: amd-gfx@lists.freedesktop.org >Cc: Deng, Emily >Subject: [PATCH v3] drm/ttm

[PATCH v3] drm/ttm: Should to return the evict error

2025-06-03 Thread Emily Deng
For the evict fail case, the evict error should be returned. v2: Consider ENOENT case. v3: Abort directly when the eviction failed for some reason (except for -ENOENT) and not wait for the move to finish Signed-off-by: Emily Deng --- drivers/gpu/drm/ttm/ttm_resource.c | 3 +++ 1 file changed,

RE: [PATCH v2] drm/amdkfd: Move the process suspend and resume out of full access

2025-06-03 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Hi Christian and Horace, Could you help to review this? Emily Deng Best Wishes >-----Original Message----- >From: Zhang, Owen(SRDC) >Sent: Friday, May 30, 2025 5:50 PM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: RE:

[PATCH v3] drm/ttm: Should to return the evict error

2025-06-03 Thread Emily Deng
For the evict fail case, the evict error should be returned. v2: Consider ENOENT case. v3: Abort directly when the eviction failed for some reason (except for -ENOENT) and not wait for the move to finish Signed-off-by: Emily Deng --- drivers/gpu/drm/ttm/ttm_resource.c | 3 +++ 1 file changed,

Re: [PATCH v10 3/4] drm/amdgpu: enable pdb0 for hibernation on SRIOV

2025-06-03 Thread Christian König
On 6/3/25 04:19, Samuel Zhang wrote: > When switching to new GPU index after hibernation and then resume, > VRAM offset of each VRAM BO will be changed, and the cached gpu > addresses needed to updated. > > This is to enable pdb0 and switch to use pdb0-based virtual gpu > address by default in amd

RE: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

2025-06-03 Thread Shankar, Uma
> -----Original Message----- > From: Pekka Paalanen > Sent: Friday, May 30, 2025 7:28 PM > To: Shankar, Uma > Cc: Simon Ser ; Harry Wentland > ; Alex Hung ; dri- > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; intel- > g...@lists.freedesktop.org; wayland-de...@lists.freedesktop.o

Re: [PATCH v3 2/5] PM: sleep: Suspend async parents after suspending children

2025-06-03 Thread Chris Bainbridge
On Fri, Mar 14, 2025 at 02:13:53PM +0100, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki > > In analogy with the previous change affecting the resume path, > make device_suspend() start the async suspend of the device's parent > after the device itself has been processed and make dpm_suspend(

Re: [PATCH 0/3] Handle aborted suspend better

2025-06-03 Thread Chris Bainbridge
On Sun, Jun 01, 2025 at 08:44:29PM -0500, Mario Limonciello wrote: > From: Mario Limonciello > > Chris Bainbridge reported some list corruption occurring around the > suspend sequence when an aborted suspend occurs. > > I couldn't reproduce this specific problem, but when I tried I found > some

Re: [PATCH 07/28] drm/amdgpu: track ring state associated with a job

2025-06-03 Thread Christian König
On 6/3/25 00:42, Alex Deucher wrote: > On Mon, Jun 2, 2025 at 10:36 AM Christian König > wrote: >> >> On 5/29/25 22:07, Alex Deucher wrote: >>> We need to know the wptr and sequence number associated >>> with a job so that we can re-emit the unprocessed state >>> after a ring reset. Pre-allocate