On Mon, 20 Mar 2023, Andi Shyti <andi.sh...@linux.intel.com> wrote:
> From: Matt Roper <matthew.d.ro...@intel.com>
>
> We occasionally see the PCI device in a non-accessible state at the
> point the driver is loaded.  When this happens, all BAR accesses will
> read back as 0xFFFFFFFF.  Rather than reading registers and
> misinterpreting their (invalid) values, let's specifically check for
> 0xFFFFFFFF in a register that cannot have that value to see if the
> device is accessible.
>
> Signed-off-by: Matt Roper <matthew.d.ro...@intel.com>
> Cc: Mika Kuoppala <mika.kuopp...@linux.intel.com>
> Signed-off-by: Andi Shyti <andi.sh...@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 34 +++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index e1e1f34490c8e..14ec45e6facfa 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2602,11 +2602,45 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
>       return 0;
>  }
>  
> +static int sanity_check_mmio_access(struct intel_uncore *uncore)
> +{
> +     struct drm_i915_private *i915 = uncore->i915;
> +
> +     if (GRAPHICS_VER(i915) < 8)
> +             return 0;
> +
> +     /*
> +      * Sanity check that MMIO access to the device is working properly.  If
> +      * the CPU is unable to communicate with a PCI device, BAR reads will
> +      * return 0xFFFFFFFF.  Let's make sure the device isn't in this state
> +      * before we start trying to access registers.
> +      *
> +      * We use the primary GT's forcewake register as our guinea pig since
> +      * it's been around since HSW and it's a masked register so the upper
> +      * 16 bits can never read back as 1's if device access is operating
> +      * properly.
> +      *
> +      * If MMIO isn't working, we'll wait up to 2 seconds to see if it
> +      * recovers, then give up.
> +      */
> +#define COND (__raw_uncore_read32(uncore, FORCEWAKE_MT) != ~0)
> +     if (wait_for(COND, 2000) == -ETIMEDOUT) {

I guess this somewhat reimplements intel_wait_for_register_fw()?
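
For reference, the pattern under discussion is a bounded poll on a single register read. Below is a minimal userspace sketch of that pattern, not the i915 helper's implementation: `reg_read_fn`, `fake_dev`, `fake_read` and `wait_for_mmio_alive` are hypothetical names made up for illustration, and the loop is bounded by a read count where a driver would bound it by elapsed time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a raw 32-bit MMIO read. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/* Hypothetical fake device: returns all ones for the first dead_reads reads. */
struct fake_dev {
	unsigned int dead_reads;
	uint32_t value;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_dev *dev = ctx;

	if (dev->dead_reads > 0) {
		dev->dead_reads--;
		return UINT32_MAX;	/* what an inaccessible BAR reads back as */
	}
	return dev->value;
}

/*
 * Poll until the register stops reading back as 0xFFFFFFFF, or until
 * max_polls reads have been made. Returns 0 on success, -1 on timeout.
 */
static int wait_for_mmio_alive(reg_read_fn read_reg, void *ctx,
			       unsigned int max_polls)
{
	assert(read_reg != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_reg(ctx) != UINT32_MAX)
			return 0;
	}
	return -1;
}
```

With a `fake_dev` whose `dead_reads` is smaller than `max_polls` the call succeeds; with one that never recovers within the budget it returns -1, which corresponds to the `-EIO` path in the patch.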

> +             drm_err(&i915->drm, "Device is non-operational; MMIO access returns 0xFFFFFFFF!\n");
> +             return -EIO;
> +     }
> +
> +     return 0;
> +}
> +
>  int intel_uncore_init_mmio(struct intel_uncore *uncore)
>  {
>       struct drm_i915_private *i915 = uncore->i915;
>       int ret;
>  
> +     ret = sanity_check_mmio_access(uncore);
> +     if (ret)
> +             return ret;
> +
>       /*
>        * The boot firmware initializes local memory and assesses its health.
>        * If memory training fails, the punit will have been instructed to

-- 
Jani Nikula, Intel Open Source Graphics Center
