Hi, Tvrtko

On 03.05.2022 15:44, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 29/04/2022 16:11, Adrian Larumbe wrote:
> > I915_PARAM_HAS_CONTEXT_ISOLATION was already being used as a boolean by
> > both Iris and Vulkan , and stood for the guarantee that, when creating a
> > new context, all state set by it will not leak to any other context.
> > 
> > However the actual return value was a bitmask where every bit stood for an
> > initialised engine, and IGT test gem_ctx_isolation makes use of this mask
> > for deciding on the actual context engine isolation status.
> > 
> > However, we do not provide UAPI for IGT tests, so the value returned by the
> > PARAM ioctl has to reflect Mesa usage as a boolean.
> 
> a)
> I suggest splitting into two patches. One changes the semantics to boolean, 
> second one changes it for RCS+CCS.
> 
> b)
> What about test coverage - both for platforms with RCS+CSS (media and blitter 
> stop being tested - all coverage is gone basically) and also for pre Gen8 
> platforms, are there failures expected there? (Test will try to run on some 
> engines which do not support isolation now, no?)

Do you mean IGT tests? I think gem_ctx_isolation.c has to be rewritten so that
the engine isolation status of the different devices is somehow hard-coded
into the test, perhaps something like the intel_device_info struct in the
kernel.

> This change only made sense after compute engine support was added to the
> driver in commit 944823c9463916dd53f3 ("drm/i915/xehp: Define compute class
> and engine") because no context isolation can be assumed on any device with
> both RCS annd CCS engines.

>Isn't it an arbitrary decision when thinking about other engine classes 
>(media, blitter) on those platforms? Commit message should be clear in that 
>respect.

I think the change was required because the addition of a CCS engine broke
pre-existing assumptions about engine context isolation, but only when
coexisting with an RCS engine in the same device. I wouldn't know about other
engines, but I'll ask around.

> Also, looking at iris:
> 
>    if (iris_getparam_integer(fd, I915_PARAM_HAS_CONTEXT_ISOLATION) <= 0) {
>       debug_error("Kernel is too old for Iris. Consider upgrading to kernel 
> v4.16.\n");
>       return NULL;
>    }
> 
> Won't this make Iris fail on RCS+CCS platforms - or I need to look at a newer 
> branch/pull request? What is the plan there?

Yes, I think I misread this code and didn't realise, when there isn't context
isolation, this snippet will fail. However, given the semantics of it, which I
glean from the error message between the brackets, I'd say context isolation not
being present shouldn't cause it to fail. So I guess it could be rewritten as

    if (iris_getparam_integer(fd, I915_PARAM_HAS_CONTEXT_ISOLATION) < 0) {

> Signed-off-by: Adrian Larumbe <adrian.laru...@collabora.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_user.c | 13 ++++++++++++-
>   drivers/gpu/drm/i915/gt/intel_engine_user.h |  1 +
>   drivers/gpu/drm/i915/i915_drm_client.h      |  2 +-
>   drivers/gpu/drm/i915/i915_getparam.c        |  2 +-
>   include/uapi/drm/i915_drm.h                 | 14 +++-----------
>   5 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> index 0f6cd96b459f..2d6bd36d6150 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
> @@ -47,7 +47,7 @@ static const u8 uabi_classes[] = {
>       [COPY_ENGINE_CLASS] = I915_ENGINE_CLASS_COPY,
>       [VIDEO_DECODE_CLASS] = I915_ENGINE_CLASS_VIDEO,
>       [VIDEO_ENHANCEMENT_CLASS] = I915_ENGINE_CLASS_VIDEO_ENHANCE,
> -     /* TODO: Add COMPUTE_CLASS mapping once ABI is available */
> +     [COMPUTE_CLASS] = I915_ENGINE_CLASS_COMPUTE,

> > I think this landed so you will need to rebase.
> 
> >   };
> >   static int engine_cmp(void *priv, const struct list_head *A,
> > @@ -306,3 +306,14 @@ unsigned int 
> > intel_engines_has_context_isolation(struct drm_i915_private *i915)
> >     return which;
> >   }
> > +
> > +bool intel_cross_engine_isolated(struct drm_i915_private *i915)
> 
> Naming feels lackluster (Intel what?). Do you expect other callers or could 
> just implement the check in i915_getparam.c, inside the switch statement?

No other callers, so I guess I should do like you suggested below, handle the 
logic straightaway from i915_getparam_ioctl.

> +{
> +     unsigned int which = intel_engines_has_context_isolation(i915);
> +
> +     if ((which & BIT(I915_ENGINE_CLASS_RENDER)) &&
> +         (which & BIT(I915_ENGINE_CLASS_COMPUTE)))
> +             return false;
> +
> +     return !!which;

> You could first just check if there are any RCS and CCS engines (for instance 
> i915->engine_uabi_class_count[], or RCS/CCS_MASK()).
> 
>   /* Comment here to explain the decision. */
>   if (RCS_MASK(&i915->gt) | CCS_MASK(&i915->gt))
>       value = 0;
>   else
>         value = !!intel_engines_has_context_isolation(i915);
> 
> ?
> 
> There also may be little point in even calling 
> intel_engines_has_context_isolation, when the desired output is a boolean and 
> could just make it feature or Gen based. Decision for later though, after 
> some clarifications.

> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.h 
> b/drivers/gpu/drm/i915/gt/intel_engine_user.h
> index 3dc7e8ab9fbc..ff21349db4d4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_user.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_user.h
> @@ -15,6 +15,7 @@ struct intel_engine_cs *
>   intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 
> instance);
>   unsigned int intel_engines_has_context_isolation(struct drm_i915_private 
> *i915);
> +bool intel_cross_engine_isolated(struct drm_i915_private *i915);
>   void intel_engine_add_user(struct intel_engine_cs *engine);
>   void intel_engines_driver_register(struct drm_i915_private *i915);
> diff --git a/drivers/gpu/drm/i915/i915_drm_client.h 
> b/drivers/gpu/drm/i915/i915_drm_client.h
> index 5f5b02b01ba0..f796c5e8e060 100644
> --- a/drivers/gpu/drm/i915/i915_drm_client.h
> +++ b/drivers/gpu/drm/i915/i915_drm_client.h
> @@ -13,7 +13,7 @@
>   #include "gt/intel_engine_types.h"
> -#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_VIDEO_ENHANCE
> +#define I915_LAST_UABI_ENGINE_CLASS I915_ENGINE_CLASS_COMPUTE
>   struct drm_i915_private;
> diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
> b/drivers/gpu/drm/i915/i915_getparam.c
> index c12a0adefda5..3d5120d2d78a 100644
> --- a/drivers/gpu/drm/i915/i915_getparam.c
> +++ b/drivers/gpu/drm/i915/i915_getparam.c
> @@ -145,7 +145,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void 
> *data,
>               value = 1;
>               break;
>       case I915_PARAM_HAS_CONTEXT_ISOLATION:
> -             value = intel_engines_has_context_isolation(i915);
> +             value = intel_cross_engine_isolated(i915);
>               break;
>       case I915_PARAM_SLICE_MASK:
>               value = sseu->slice_mask;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 35ca528803fd..84c0af77cc1f 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -166,6 +166,7 @@ enum drm_i915_gem_engine_class {
>       I915_ENGINE_CLASS_COPY          = 1,
>       I915_ENGINE_CLASS_VIDEO         = 2,
>       I915_ENGINE_CLASS_VIDEO_ENHANCE = 3,
> +     I915_ENGINE_CLASS_COMPUTE       = 4,
>       /* should be kept compact */
> @@ -635,17 +636,8 @@ typedef struct drm_i915_irq_wait {
>   #define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
>   /*
> - * Query whether every context (both per-file default and user created) is
> - * isolated (insofar as HW supports). If this parameter is not true, then
> - * freshly created contexts may inherit values from an existing context,
> - * rather than default HW values. If true, it also ensures (insofar as HW
> - * supports) that all state set by this context will not leak to any other
> - * context.
> - *
> - * As not every engine across every gen support contexts, the returned
> - * value reports the support of context isolation for individual engines by
> - * returning a bitmask of each engine class set to true if that class 
> supports
> - * isolation.
> + * Query whether the device can make cross-engine isolation guarantees for
> + * all the engines whose default state has been initialised.

> "Engine whose default state has been initialised" does not sound very helpful 
> for userspace developers. Like what determines if that has happened or not, 
> and the fact userspace is more about context state rather than engine state.
> 
> Existing text which talked about engines supporting contexts and not leaking 
> state sounded better TBF. So overall I think you deleted too much of the 
> text. Can't you just change the ending, from the point where it goes into 
> individual engines and bitmap, replacing with the new explanation?

I made a mistake here. I assumed mentioning this would be relevant because
engine isolation is only checked for engines whose default state had been
assigned in __engines_record_defaults, but like you said this is a KM detail
userspace developers shouldn't concern themselves with, so I'll restore that
part of the previous text.

> Regards,
> 
> Tvrtko

>    */
>   #define I915_PARAM_HAS_CONTEXT_ISOLATION 50

Kind Regards,
Adrian Larumbe

Reply via email to