On 18/10/2022 23:15, Vinay Belgaumkar wrote:
Waitboost (when SLPC is enabled) results in a H2G message. This can result
in thousands of messages during a stress test and fill up an already full
CTB. There is no need to request for RP0 if GuC is already requesting the
same.

Signed-off-by: Vinay Belgaumkar <vinay.belgaum...@intel.com>
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 9 ++++++++-
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index fc23c562d9b2..a20ae4fceac8 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1005,13 +1005,20 @@ void intel_rps_dec_waiters(struct intel_rps *rps)
  void intel_rps_boost(struct i915_request *rq)
  {
        struct intel_guc_slpc *slpc;
+       struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

        if (i915_request_signaled(rq) || i915_request_has_waitboost(rq))
                return;

+       /* If GuC is already requesting RP0, skip */
+       if (rps_uses_slpc(rps)) {
+               slpc = rps_to_slpc(rps);
+               if (intel_rps_get_requested_frequency(rps) == slpc->rp0_freq)
+                       return;
+       }
+

Feels a little bit like a layering violation. Wait boost reference counts and request markings will change based on asynchronous state - an MMIO read.

Also, a little below we have this:

"""
        /* Serializes with i915_request_retire() */
        if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
                struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

                if (rps_uses_slpc(rps)) {
                        slpc = rps_to_slpc(rps);

                        /* Return if old value is non zero */
                        if (!atomic_fetch_inc(&slpc->num_waiters))

***>>>> Wouldn't it skip doing anything here already? <<<<***

                                schedule_work(&slpc->boost_work);

                        return;
                }

                if (atomic_fetch_inc(&rps->num_waiters))
                        return;
"""
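
To make the question above concrete, here is a minimal userspace sketch of the waiter-count gate, assuming C11 atomic_fetch_add() behaves like the kernel's atomic_fetch_inc() in returning the old value. The struct and function names are hypothetical and are not i915 code; boosts_scheduled merely stands in for schedule_work().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SLPC state; only the waiter count is modelled. */
struct waiter_model {
	atomic_int num_waiters;
	int boosts_scheduled;	/* how many times the boost work would be queued */
};

/* Returns true if this caller is the one that would queue the boost work. */
static bool model_boost(struct waiter_model *m)
{
	/* Old value zero means no other waiter has already asked for a boost. */
	if (!atomic_fetch_add(&m->num_waiters, 1)) {
		m->boosts_scheduled++;
		return true;
	}

	return false;
}
```

In this model only the first of several concurrent waiters queues the work; later callers just bump the count until it drops back to zero.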

But I wonder if this is not a layering violation already. It looks like one to me at the moment. And as it happens, there is an ongoing debug of clvk slowness where I was a bit puzzled by the lack of "boost fence" in trace_printk logs - but now I see how that happens. It does not feel right to me that we lose that tracing with SLPC.

So in general - why wouldn't the correct approach be to solve this in the worker, which perhaps should fork to an SLPC-specific branch and do the consolidations/skips based on MMIO reads in there?
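
A rough sketch of what such a worker-side skip could look like, written as a standalone model rather than driver code. Every name here is hypothetical: requested_freq stands in for whatever the MMIO read would return, and h2g_sent counts the messages that would go to the GuC.

```c
#include <stdbool.h>

/* Hypothetical model of the SLPC boost state; not the i915 structures. */
struct slpc_worker_model {
	unsigned int rp0_freq;		/* maximum (RP0) frequency */
	unsigned int requested_freq;	/* stand-in for the MMIO-read requested frequency */
	unsigned int h2g_sent;		/* H2G messages that would have been issued */
};

/* Stand-in for the H2G boost request. */
static void model_send_boost(struct slpc_worker_model *m)
{
	m->h2g_sent++;
	m->requested_freq = m->rp0_freq;
}

/*
 * Worker body: the skip lives here, so the request path still marks the
 * fence and counts waiters regardless of what the hardware reports.
 */
static bool model_boost_work(struct slpc_worker_model *m)
{
	if (m->requested_freq == m->rp0_freq)
		return false;	/* already at RP0, nothing to send */

	model_send_boost(m);
	return true;
}
```

With the check placed in the worker, the waiter accounting and the "boost fence" tracing in the caller would stay independent of the asynchronous frequency state.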

Regards,

Tvrtko

        /* Serializes with i915_request_retire() */
        if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) {
-               struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps;

                if (rps_uses_slpc(rps)) {
                        slpc = rps_to_slpc(rps);
