Here is the impact: Ryzen 5 2500U Idle PkgWatt: 1.3W -> 0.75W glxgears fps: 7200 -> 4500 aquarium 15,000 fish: 56-60 -> 45-50
These results are for gfx_off_ctrl_immediate() https://github.com/pacoandres/laikm/issues/56 https://github.com/pacoandres/laikm/issues/47 Same results are for default zero GFXOFF delay: https://github.com/pacoandres/laikm/issues/74 On Mon, Mar 24, 2025 at 5:18 PM Alex Deucher <alexdeuc...@gmail.com> wrote: > > On Mon, Mar 24, 2025 at 11:14 AM Sergey Kovalenko > <seryoga.engineer...@gmail.com> wrote: > > > > "Moreover the work only gets scheduled/cancelled > > around a ref count of 0" - and this happens more than 600 times > > per second under load, as you can see from the table. > > Moreover these 600 GFXOFF requests are executed with a very > > little interval between them. If we set GFXOFF delay to 0 by > > default, it will be enabled and disabled 600 times with pretty > > significant performance drop in any test. > > Can you show me the impact? > > > Keeping 100ms GFXOFF delay leads to 2 times higher > > power consumption in idle, nicely noticeable for laptops. > > That will be fixed in kernel 6.15 which adds the > gfx_off_ctrl_immediate() patches. > > Alex > > > > > I don't think this is the behavior the users want to see. > > Workarounds introduced a significant overhead, and > > to mitigate it there is a simple solution to limit the amount > > of those calls. > > > > Regards, > > Sergey > > > > On Mon, Mar 24, 2025 at 4:50 PM Alex Deucher <alexdeuc...@gmail.com> wrote: > > > > > > On Mon, Mar 24, 2025 at 10:34 AM Sergey Kovalenko > > > <seryoga.engineer...@gmail.com> wrote: > > > > > > > > Hello Alex! > > > > > > > > "If there are a lot of requests to toggle gfxoff, the worker thread to > > > > allow it again gets > > > > cancelled and scheduled again, extending the time it's disallowed." - > > > > That's true except one thing: > > > > cancelling and scheduling also take CPU cycles, and the pause between > > > > submissions can exceed > > > > 100ms interval, which leads to GFXOFF being enabled even if the GPU is > > > > loaded. > > > > This simple prediction algorithm eliminates such cases. > > > > GFXOFF is enabled immediately when there are only a few submissions > > > > ongoing, which > > > > means the system is idling. And when the number of compute submissions > > > > exceeds the > > > > range, the algorithm chooses a longer delay. Finally it simply ignores > > > > GFXOFF ON requests > > > > to prevent performance drops under high load. > > > > This allows keeping idle power consumption low and using stable GPU > > > > performance under load. > > > > > > What performance drops? Your data doesn't show any. Moreover the > > > work only gets scheduled/cancelled around a ref count of 0. Plus, your > > > change introduces a bunch of additional calculations for this function > > > which add overhead which you are concerned about. > > > > > > Alex > > > > > > > > > > > Regards, > > > > Sergey > > > > > > > > On Mon, Mar 24, 2025 at 4:14 PM Alex Deucher <alexdeuc...@gmail.com> > > > > wrote: > > > > > > > > > > On Mon, Mar 24, 2025 at 5:06 AM Sergey Kovalenko > > > > > <seryoga.engineer...@gmail.com> wrote: > > > > > > > > > > > > Predict an optimal delay to enable GFXOFF for the next interval > > > > > > based on the request count: > > > > > > - less than 15 requests per second - zero delay > > > > > > - less than 25 requests per second - default delay > > > > > > - 25 and more requests per second - don't enable GFXOFF > > > > > > > > > > > > The algorithm allows maintaining low power consumption in idle, > > > > > > as well as using the full GPU power under load by eliminating > > > > > > hundreds of extra GFXOFF ON/OFF switches. > > > > > > > > > > I still don't understand what problem this is solving. This already > > > > > happens with the way the code works now. If there are a lot of > > > > > requests to toggle gfxoff, the worker thread to allow it again gets > > > > > cancelled and scheduled again, extending the time it's disallowed. > > > > > > > > > > Alex > > > > > > > > > > > > > > > Alex > > > > > > > > > > > > > > > > > Test configuration: > > > > > > - Ryzen 5 2500U > > > > > > - Ryzen 5 3400G > > > > > > - Chromium 134.0.6998.88 Arch Linux > > > > > > - Mesa 1:24.3.4-1 > > > > > > - KDE Plasma 6.3.2 > > > > > > > > > > > > GFXOFF enable requests per second: > > > > > > | Test | min | max | > > > > > > |-----------------------------------------------------|-----|-----| > > > > > > | System idle | 0 | 64 | > > > > > > | Web browsing | 0 | 127 | > > > > > > | WebGL aquarium | 10 | 236 | > > > > > > | Heavy load: glxgears + vkcube + resizing + flipping | 39 | 677 | > > > > > > > > > > > > Test results, Ryzen 5 2500U: > > > > > > | Test | patched-6.13.7.arch1 | > > > > > > 6.13.7.arch1-1 | > > > > > > |-----------------------------|----------------------|------------------| > > > > > > | System idle (PkgWatt) | ~0.74W | > > > > > > ~1.25W | > > > > > > | glxgears (vblank_mode=0) | ~7300 fps, ~7.3W | ~7300 fps, > > > > > > ~7.3W | > > > > > > | WebGL aquarium 15.000 fish | 56-60 fps, ~9.8W | 55-60 fps, > > > > > > ~9.8W | > > > > > > > > > > > > Test results, Ryzen 5 3400G: > > > > > > | Test | patched-6.13.7.arch1 | > > > > > > 6.13.7.arch1-1 | > > > > > > |-----------------------------|----------------------|------------------| > > > > > > | System idle (PkgWatt) | ~3.8W | > > > > > > ~4.3W | > > > > > > | glxgears (vblank_mode=0) | ~7200 fps | ~7200 > > > > > > fps | > > > > > > | WebGL aquarium 30.000 fish | 37 fps, 47W | 38 fps, > > > > > > 47W | > > > > > > > > > > > > Signed-off-by: Sergey Kovalenko <seryoga.engineer...@gmail.com> > > > > > > Tested-by: Liam Fleming <fleming.squa...@proton.me> > > > > > > --- > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 94 > > > > > > +++++++++++++++++-------- > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 + > > > > > > 2 files changed, 67 insertions(+), 30 deletions(-) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > > > > index c1f35ded684e..5e23b956e0bf 100644 > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c > > > > > > @@ -781,55 +781,89 @@ int amdgpu_gfx_enable_kgq(struct amdgpu_device > > > > > > *adev, int xcc_id) > > > > > > * 3. other client can cancel their request of disable gfx off > > > > > > feature > > > > > > * 4. other client should not send request to enable gfx off > > > > > > feature > > > > > > before disable gfx off feature. > > > > > > */ > > > > > > - > > > > > > void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) > > > > > > { > > > > > > - unsigned long delay = GFX_OFF_DELAY_ENABLE; > > > > > > - > > > > > > if (!(adev->pm.pp_feature & PP_GFXOFF_MASK)) > > > > > > return; > > > > > > > > > > > > mutex_lock(&adev->gfx.gfx_off_mutex); > > > > > > > > > > > > if (enable) { > > > > > > - /* If the count is already 0, it means there's an > > > > > > imbalance bug > > > > > > somewhere. > > > > > > - * Note that the bug may be in a different caller > > > > > > than the one which > > > > > > triggers the > > > > > > - * WARN_ON_ONCE. > > > > > > - */ > > > > > > - if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0)) > > > > > > + /* This case covers multiple calls from parallel > > > > > > threads */ > > > > > > + if (!adev->gfx.gfx_off_req_count) > > > > > > goto unlock; > > > > > > > > > > > > - adev->gfx.gfx_off_req_count--; > > > > > > + /* Process only if req_count == 0 and GFXOFF is > > > > > > disabled */ > > > > > > + if (--adev->gfx.gfx_off_req_count || > > > > > > adev->gfx.gfx_off_state) > > > > > > + goto unlock; > > > > > > + > > > > > > + /* If going to s2idle, no need to wait */ > > > > > > + if (adev->in_s0ix) { > > > > > > + if (!amdgpu_dpm_set_powergating_by_smu( > > > > > > + adev, AMD_IP_BLOCK_TYPE_GFX, > > > > > > true, 0)) > > > > > > + adev->gfx.gfx_off_state = true; > > > > > > + > > > > > > + /* Reset delay flag */ > > > > > > + adev->gfx.gfx_off_use_delay = 0; > > > > > > + goto unlock; > > > > > > + } > > > > > > > > > > > > - if (adev->gfx.gfx_off_req_count == 0 && > > > > > > - !adev->gfx.gfx_off_state) { > > > > > > - /* If going to s2idle, no need to wait */ > > > > > > - if (adev->in_s0ix) { > > > > > > - if > > > > > > (!amdgpu_dpm_set_powergating_by_smu(adev, > > > > > > - > > > > > > AMD_IP_BLOCK_TYPE_GFX, true, 0)) > > > > > > - adev->gfx.gfx_off_state = > > > > > > true; > > > > > > + ++adev->gfx.gfx_off_counter; > > > > > > + > > > > > > + uint64_t now = get_jiffies_64(); > > > > > > + uint64_t delta = > > > > > > + jiffies_to_msecs(now - > > > > > > adev->gfx.gfx_off_timestamp); > > > > > > + > > > > > > + if (delta >= 1000u) { > > > > > > + /* > > > > > > + * Predict the optimal delay for the next > > > > > > interval > > > > > > + * based on the current number of requests: > > > > > > + * <15 - idle, no delay > > > > > > + * <25 - light/medium load, default delay > > > > > > + * 25 and more - high load, GFXOFF disabled > > > > > > + */ > > > > > > + if (adev->gfx.gfx_off_counter < 15u) { > > > > > > + adev->gfx.gfx_off_use_delay = 0; > > > > > > + } else if (adev->gfx.gfx_off_counter < 25u) > > > > > > { > > > > > > + adev->gfx.gfx_off_use_delay = 1; > > > > > > } else { > > > > > > - > > > > > > schedule_delayed_work(&adev->gfx.gfx_off_delay_work, > > > > > > - delay); > > > > > > + adev->gfx.gfx_off_use_delay = 2; > > > > > > } > > > > > > + > > > > > > + adev->gfx.gfx_off_counter = 0; > > > > > > + adev->gfx.gfx_off_timestamp = now; > > > > > > } > > > > > > + > > > > > > + /* Don't schedule gfxoff under heavy load */ > > > > > > + if (adev->gfx.gfx_off_use_delay == 2) > > > > > > + goto unlock; > > > > > > + > > > > > > + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, > > > > > > + adev->gfx.gfx_off_use_delay ? > > > > > > + GFX_OFF_DELAY_ENABLE : > > > > > > + GFX_OFF_NO_DELAY); > > > > > > } else { > > > > > > - if (adev->gfx.gfx_off_req_count == 0) { > > > > > > - > > > > > > cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); > > > > > > + /* GFXOFF was enabled when req_count == 0 */ > > > > > > + if (++adev->gfx.gfx_off_req_count != 1) > > > > > > + goto unlock; > > > > > > > > > > > > - if (adev->gfx.gfx_off_state && > > > > > > - > > > > > > !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, > > > > > > false, 0)) { > > > > > > - adev->gfx.gfx_off_state = false; > > > > > > + /* Nothing to do if the work wasn't scheduled */ > > > > > > + if (adev->gfx.gfx_off_use_delay == 2) > > > > > > + goto unlock; > > > > > > > > > > > > - if > > > > > > (adev->gfx.funcs->init_spm_golden) { > > > > > > - dev_dbg(adev->dev, > > > > > > - "GFXOFF is > > > > > > disabled, re-init SPM golden settings\n"); > > > > > > - > > > > > > amdgpu_gfx_init_spm_golden(adev); > > > > > > - } > > > > > > + > > > > > > cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); > > > > > > + > > > > > > + if (adev->gfx.gfx_off_state && > > > > > > + !amdgpu_dpm_set_powergating_by_smu( > > > > > > + adev, AMD_IP_BLOCK_TYPE_GFX, false, 0)) > > > > > > { > > > > > > + adev->gfx.gfx_off_state = false; > > > > > > + > > > > > > + if (adev->gfx.funcs->init_spm_golden) { > > > > > > + dev_dbg(adev->dev, > > > > > > + "GFXOFF is disabled, > > > > > > re-init SPM golden settings\n"); > > > > > > + amdgpu_gfx_init_spm_golden(adev); > > > > > > } > > > > > > } > > > > > > - > > > > > > - adev->gfx.gfx_off_req_count++; > > > > > > } > > > > > > > > > > > > unlock: > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h > > > > > > index 8b5bd63b5773..38fd445a353b 100644 > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h > > > > > > @@ -430,7 +430,10 @@ struct amdgpu_gfx { > > > > > > /* gfx off */ > > > > > > bool gfx_off_state; /* > > > > > > true: enabled, > > > > > > false: disabled */ > > > > > > struct mutex gfx_off_mutex; /* > > > > > > mutex to > > > > > > change gfxoff state */ > > > > > > + uint64_t gfx_off_timestamp; /* > > > > > > gfxoff enable call timestamp */ > > > > > > + uint32_t gfx_off_use_delay; /* flag > > > > > > to choose the delay range */ > > > > > > uint32_t gfx_off_req_count; /* > > > > > > default 1, > > > > > > enable gfx off: dec 1, disable gfx off: add 1 */ > > > > > > + uint32_t gfx_off_counter; /* > > > > > > count of gfxoff enable calls */ > > > > > > struct delayed_work gfx_off_delay_work; /* > > > > > > async work to > > > > > > set gfx block off */ > > > > > > uint32_t gfx_off_residency; /* last > > > > > > logged > > > > > > residency */ > > > > > > uint64_t gfx_off_entrycount; /* > > > > > > count of times > > > > > > GPU has get into GFXOFF state */ > > > > > > -- > > > > > > 2.49.0