[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Zhou1, Tao <tao.zh...@amd.com>
> Sent: Wednesday, April 2, 2025 11:02 PM
> To: Skvortsov, Victor <victor.skvort...@amd.com>;
> amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking <hawking.zh...@amd.com>; Zhao, Victor
> <victor.z...@amd.com>
> Subject: RE: [PATCH] drm/amdgpu: Disable ACA on VFs
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> > -----Original Message-----
> > From: Skvortsov, Victor <victor.skvort...@amd.com>
> > Sent: Thursday, April 3, 2025 6:16 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Zhang, Hawking <hawking.zh...@amd.com>; Zhao, Victor
> > <victor.z...@amd.com>; Zhou1, Tao <tao.zh...@amd.com>; Skvortsov,
> > Victor <victor.skvort...@amd.com>
> > Subject: [PATCH] drm/amdgpu: Disable ACA on VFs
> >
> > VFs query RAS error counts directly from host with
> > AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled, an unusable
> > aca_sysfs is created rather than amdgpu_ras_sysfs_create()
> >
> > Likewise, VFs depend on host support to query CPERs, rather than ACA
> > component.
> >
> > Signed-off-by: Victor Skvortsov <victor.skvort...@amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c |  4 ++--
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c  | 10 ++++++----
> >  2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
> > index 360e07a5c7c1..5a234eadae8b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
> > @@ -549,7 +549,7 @@ int amdgpu_cper_init(struct amdgpu_device *adev)
> >  {
> >  	int r;
> >
> > -	if (!amdgpu_aca_is_enabled(adev))
> > +	if (!amdgpu_aca_is_enabled(adev) &&
> > +	    !amdgpu_sriov_ras_cper_en(adev))
>
> [Tao] can we put amdgpu_sriov_ras_cper_en into amdgpu_aca_is_enabled?

[Victor] This will cause problems inside amdgpu_ras_sysfs_create() since VFs use the legacy sysfs to report IP block error counts through AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY.

>
> >  		return 0;
> >
> >  	r = amdgpu_cper_ring_init(adev);
> > @@ -568,7 +568,7 @@ int amdgpu_cper_init(struct amdgpu_device *adev)
> >
> >  int amdgpu_cper_fini(struct amdgpu_device *adev)
> >  {
> > -	if (!amdgpu_aca_is_enabled(adev))
> > +	if (!amdgpu_aca_is_enabled(adev) &&
> > +	    !amdgpu_sriov_ras_cper_en(adev))
> >  		return 0;
> >
> >  	adev->cper.enabled = false;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > index ebf1f63d0442..5bb7673fd28e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> > @@ -3794,10 +3794,12 @@ static void amdgpu_ras_check_supported(struct amdgpu_device *adev)
> >  		adev->ras_hw_enabled & amdgpu_ras_mask;
> >
> >  	/* aca is disabled by default except for psp v13_0_6/v13_0_12/v13_0_14 */
> > -	adev->aca.is_enabled =
> > -		(amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6) ||
> > -		 amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 12) ||
> > -		 amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 14));
> > +	if (!amdgpu_sriov_vf(adev)) {
> > +		adev->aca.is_enabled =
> > +			(amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6) ||
> > +			 amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 12) ||
> > +			 amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 14));
> > +	}
> >
> >  	/* bad page feature is not applicable to specific app platform */
> >  	if (adev->gmc.is_app_apu &&
> > --
> > 2.34.1
>
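
For anyone following the thread, the trade-off discussed above can be sketched as a small standalone model. This is not driver code. struct model_dev, its fields, and all four helper names are hypothetical stand-ins, and the rule that sysfs creation falls back to the legacy path whenever ACA is off is an assumption drawn from the reply above, not taken from amdgpu_ras_sysfs_create() itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few device properties discussed in the thread. */
struct model_dev {
	bool is_vf;          /* models amdgpu_sriov_vf(adev) */
	bool aca_capable_ip; /* models the MP0 13.0.6 / 13.0.12 / 13.0.14 check */
	bool host_cper_en;   /* models amdgpu_sriov_ras_cper_en(adev) */
};

/* As in the patch: the IP version check only applies to non-VF devices. */
static bool aca_enabled_patch(const struct model_dev *d)
{
	return !d->is_vf && d->aca_capable_ip;
}

/* The suggested alternative: fold host CPER support into the ACA predicate. */
static bool aca_enabled_folded(const struct model_dev *d)
{
	return aca_enabled_patch(d) || (d->is_vf && d->host_cper_en);
}

/* As in the patch: CPER init proceeds if ACA or host CPER support is present. */
static bool cper_init_runs(const struct model_dev *d)
{
	return aca_enabled_patch(d) || d->host_cper_en;
}

/* Assumption: the legacy RAS sysfs is created only when ACA reports disabled. */
static bool uses_legacy_sysfs(bool aca_enabled)
{
	return !aca_enabled;
}
```

Under these assumptions, a VF on an ACA-capable IP with host CPER support keeps the legacy sysfs and still initializes CPER with the patch as posted, while the folded predicate would report ACA as enabled and steer the VF away from the legacy sysfs.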