On Thu, Mar 27, 2025 at 10:37 PM Feng, Kenneth <kenneth.f...@amd.com> wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > -----Original Message----- > From: Alex Deucher <alexdeuc...@gmail.com> > Sent: Thursday, March 27, 2025 10:09 PM > To: Feng, Kenneth <kenneth.f...@amd.com> > Cc: amd-gfx@lists.freedesktop.org; Wang, Yang(Kevin) > <kevinyang.w...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; > Wentland, Harry <harry.wentl...@amd.com> > Subject: Re: [PATCH] drm/amd/display: port the workload profile setting logic > into dm > > Caution: This message originated from an External Source. Use proper caution > when opening attachments, clicking links, or responding. > > > On Thu, Mar 27, 2025 at 4:22 AM Feng, Kenneth <kenneth.f...@amd.com> wrote: > > > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > -----Original Message----- > > From: Alex Deucher <alexdeuc...@gmail.com> > > Sent: Wednesday, March 26, 2025 11:08 PM > > To: Feng, Kenneth <kenneth.f...@amd.com> > > Cc: amd-gfx@lists.freedesktop.org; Wang, Yang(Kevin) > > <kevinyang.w...@amd.com>; Deucher, Alexander > > <alexander.deuc...@amd.com>; Wentland, Harry <harry.wentl...@amd.com> > > Subject: Re: [PATCH] drm/amd/display: port the workload profile > > setting logic into dm > > > > Caution: This message originated from an External Source. Use proper > > caution when opening attachments, clicking links, or responding. > > > > > > On Wed, Mar 26, 2025 at 1:22 AM Kenneth Feng <kenneth.f...@amd.com> wrote: > > > > > > Port the workload profile setting logic into dm before MALL optimization. > > > > > > Background: > > > MALL optimization strategy has changed in the firmware.Previously, > > > firmware does not care what workload type it is, once there is a > > > request from DAL for MALL, firmware immediately trigger the MALL setting > > > sequence on the SoC, so called D0i3.x idle power sequence. > > > Now, before the D0i3.x sequence starts, firmware always check if the > > > workload type is default, if it is not, then abort the D0i3.x sequence. > > > > > > Issue: > > > Due to this strategy change, the task is moved to driver to make > > > sure if gfx is really idle and if it is, reset the workload to default. > > > Without this task, when DAL's work task for MALL optimization tries > > > to do the optimization request to DMCUB->pmfw, the workload type is > > > always 3D fullscreen or compute, then MALL will never be applied. > > > > > > Why: > > > The idle task for setting workload type back to default interval is > > > 1 second currently. The DAL's work task to optimize MALL always > > > starts before the idle task for setting workload type back to > > > default. There is no way to ask the idle task in the base driver to > > > reset the workload type ahead of the DAL's MALL setting work task kicks > > > off. There could be a workaround which sets the idle task interval to 10 > > > millisecond. However, this causes some call trace issues in which the > > > workqueues is flushed. > > > > That should already fixed by this commit: > > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > > /commit/?id=de35994ecd2dd6148ab5a6c5050a1670a04dec77 > > > > > > > > Side effect: > > > This solution is to port the logic in idle thread to DAL: check the > > > fence and make sure gfx is idle, then reset the workload type. It is > > > fine that when DAL's work task exits the MALL optimization, it does not > > > set back the workload type to 3d fullscreen or compute since the work > > > task in base driver can make sure the workload type can be set back once > > > there are jobs in the ring. > > > > I guess this works because the workload ref count gets clamped to 0, but > > this call is not balanced. It also doesn't handle the VIDEO profile that > > gets set when using VCN or the COMPUTE profile when KFD queues are active. > > Those would also prevent the idle optimizations. > > Also what happens if the profile changes after DC optimizations are > > enabled? Does that cause the optimizations to exit or will they stay > > intact until DC tells it to exit? > > > > So I think we have two options: > > 1. always disable the 3D, compute, video profiles when entering the DAL > > optimization. subsequently, additional job submissions may change the > > workload. will that be a problem? > > 2. Add a helper to pause workload profiles while the DC optimization is > > active. If the profile only has to be changed while enabling the DC > > optimization, we can just call it right before and right after the dc > > optimizations. Something like the attached patches should be a good > > starting point. > > > > Alex > > > > Hi Alex, > > In the attached patch, I guess smu->pause_workload is not needed since > > workload can be switched after idle optimization is triggered. And > > smu_pause_power_profile may not need the bool flag since I still think that > > we don't need to take care of the imbalance of workload setting. > > That is said, we just need to set the workload to default when idle > > optimization is requested. When the idle optimization is cancelled from > > DAL, it doesn't need to restore the previous workload setting since the > > ring work will set it. > > Attached is my patch. It works on my system. Let me know your thoughts. > > This will race with any command submissions unless you add a lock. I think > you want something like: > > /* switch to the bootup default profile */ > amdgpu_dpm_pause_power_profile(adev, true); > dc_allow_idle_optimizations(dm->dc, true); > /* resume existing profiles */ > amdgpu_dpm_pause_power_profile(adev, false); > > > Alex > > > Thanks. > > > > Kenneth > > > It works with your two patches, plus the attached one. Thanks. > By the way, do you think we need a mutex lock when accessing > smu->pause_workload?
I think the pm.mutex handles this case. Alex > Kenneth > > > > > > > > Signed-off-by: Kenneth Feng <kenneth.f...@amd.com> > > > --- > > > .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c | 29 ++++++++++++++++++- > > > 1 file changed, 28 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c > > > index 36a830a7440f..2adb3b72ed05 100644 > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c > > > @@ -244,6 +244,20 @@ static void > > > amdgpu_dm_crtc_vblank_control_worker(struct work_struct *work) > > > struct vblank_control_work *vblank_work = > > > container_of(work, struct vblank_control_work, work); > > > struct amdgpu_display_manager *dm = vblank_work->dm; > > > + u32 i, fences = 0; > > > + int r; > > > + enum PP_SMC_POWER_PROFILE profile; > > > + struct amdgpu_device *adev = drm_to_adev(dm->ddev); > > > + > > > + if (adev->gfx.num_gfx_rings) > > > + profile = PP_SMC_POWER_PROFILE_FULLSCREEN3D; > > > + else > > > + profile = PP_SMC_POWER_PROFILE_COMPUTE; > > > + > > > + for (i = 0; i < AMDGPU_MAX_GFX_RINGS; ++i) > > > + fences += > > > amdgpu_fence_count_emitted(&adev->gfx.gfx_ring[i]); > > > + for (i = 0; i < (AMDGPU_MAX_COMPUTE_RINGS * > > > AMDGPU_MAX_GC_INSTANCES); ++i) > > > + fences += > > > + amdgpu_fence_count_emitted(&adev->gfx.compute_ring[i]); > > > > > > mutex_lock(&dm->dc_lock); > > > > > > @@ -271,8 +285,21 @@ static void > > > amdgpu_dm_crtc_vblank_control_worker(struct work_struct *work) > > > vblank_work->acrtc->dm_irq_params.allow_sr_entry); > > > } > > > > > > - if (dm->active_vblank_irq_count == 0) > > > + if (dm->active_vblank_irq_count == 0) { > > > + if (adev->gfx.num_gfx_rings && !fences && > > > !atomic_read(&adev->gfx.total_submission_cnt)) { > > > + mutex_lock(&adev->gfx.workload_profile_mutex); > > > + if (adev->gfx.workload_profile_active) { > > > + r = amdgpu_dpm_switch_power_profile(adev, > > > profile, false); > > > + if (r) > > > + dev_warn(adev->dev, "(%d) failed to > > > disable %s power profile mode\n", r, > > > + > > > profile == PP_SMC_POWER_PROFILE_FULLSCREEN3D ? > > > + > > > "fullscreen 3D" : "compute"); > > > + adev->gfx.workload_profile_active = false; > > > + } > > > + mutex_unlock(&adev->gfx.workload_profile_mutex); > > > + } > > > dc_allow_idle_optimizations(dm->dc, true); > > > + } > > > > > > mutex_unlock(&dm->dc_lock); > > > > > > -- > > > 2.34.1 > > >