On 3/31/26 11:20, Thomas Hellström wrote:
> The xe driver was using the drm_exec retry pointer directly to
> restart the locking loop after out-of-memory errors. This is
> relying on undocumented behaviour.
> 
> Instead add a drm_exec_retry() macro that can be used in this
> situation, and that also asserts that the struct drm_exec is
> in a state that is compatible with retrying:
> Either newly initialized or in a contended state with all locks
> dropped.
> 
> Use that macro in xe.
> 
> Signed-off-by: Thomas Hellström <[email protected]>
> ---
>  drivers/gpu/drm/xe/xe_validation.h |  2 +-
>  include/drm/drm_exec.h             | 13 +++++++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_validation.h b/drivers/gpu/drm/xe/xe_validation.h
> index a30e732c4d51..4cd955ce6cd2 100644
> --- a/drivers/gpu/drm/xe/xe_validation.h
> +++ b/drivers/gpu/drm/xe/xe_validation.h
> @@ -146,7 +146,7 @@ bool xe_validation_should_retry(struct xe_validation_ctx *ctx, int *ret);
>  #define xe_validation_retry_on_oom(_ctx, _ret)                               \
>       do {                                                            \
>               if (xe_validation_should_retry(_ctx, _ret))             \
> -                     goto *__drm_exec_retry_ptr;                     \
> +                     drm_exec_retry((_ctx)->exec);                   \

Oh, that goto is extremely questionable to begin with.

>       } while (0)
>  
>  /**
> diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h
> index fc95a979e253..5ed5be1f8244 100644
> --- a/include/drm/drm_exec.h
> +++ b/include/drm/drm_exec.h
> @@ -138,6 +138,19 @@ static inline bool drm_exec_is_contended(struct drm_exec *exec)
>       return !!exec->contended;
>  }
>  
> +/**
> + * drm_exec_retry() - Unconditionally restart the loop to grab all locks.
> + * @exec: drm_exec object
> + *
> + * Unconditionally retry the loop to lock all objects. For consistency,
> + * the exec object needs to be newly initialized or contended.
> + */
> +#define drm_exec_retry(_exec)                                \
> +     do {                                            \
> +             WARN_ON(!drm_exec_is_contended(_exec)); \

This warning would trigger!

See the code in xe_bo_notifier_prepare_pinned() for example:

                        drm_exec_retry_on_contention(&exec);
                        ret = PTR_ERR(backup);
                        xe_validation_retry_on_oom(&ctx, &ret);

Without contention we would just skip the loop and never lock anything.

What XE does here just doesn't work as far as I can see.
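
To illustrate, here is a rough userspace sketch of the control flow described above. It is not the kernel code: struct mock_exec, mock_dummy and the mock_* helpers are made-up stand-ins, and it assumes that the loop condition ends the loop whenever nothing is marked as contended. The retry jump is modelled with a plain "continue" instead of the computed goto.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_exec; not the kernel type. */
struct mock_exec {
	const void *contended;	/* NULL: no contention, else: restart needed */
	unsigned int num_locked;
};

/* Assumed sentinel marking a newly initialized context. */
static const char mock_dummy;

static void mock_exec_init(struct mock_exec *exec)
{
	exec->contended = &mock_dummy;
	exec->num_locked = 0;
}

/* Loop condition: returns true if the locking body should run (again). */
static bool mock_exec_cleanup(struct mock_exec *exec)
{
	if (!exec->contended)
		return false;		/* nothing contended: the loop ends */

	exec->contended = NULL;		/* first pass or contended: start over */
	exec->num_locked = 0;
	return true;
}

/*
 * Run the locking loop, simulate an OOM on the first pass and jump back
 * to the loop condition without any contention being recorded.
 * Returns how many times the loop body was executed.
 */
static unsigned int mock_run_with_oom_retry(struct mock_exec *exec)
{
	unsigned int passes = 0;
	bool oom_pending = true;

	mock_exec_init(exec);
	while (mock_exec_cleanup(exec)) {
		passes++;
		exec->num_locked++;	/* pretend one object was locked */

		if (oom_pending) {
			oom_pending = false;
			continue;	/* models "goto *__drm_exec_retry_ptr" */
		}
	}
	return passes;
}
```

Under these assumptions mock_run_with_oom_retry() runs the body only once: the jump back finds exec->contended == NULL, so the loop terminates instead of locking everything again.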

Regards,
Christian.

> +             goto *__drm_exec_retry_ptr;             \
> +     } while (0)
> +
>  void drm_exec_init(struct drm_exec *exec, u32 flags, unsigned nr);
>  void drm_exec_fini(struct drm_exec *exec);
>  bool drm_exec_cleanup(struct drm_exec *exec);
