Like I stated elsewhere, I would recommend noretry=0 for Navi and later GPUs because there is no performance advantage from disabling retry on those GPUs.
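For illustration only, here is a rough, untested sketch of how the default selection could look with that recommendation applied. It is not a patch against amdgpu_gmc.c: `enum chip_type`, `CHIP_OTHER` and `pick_noretry()` are hypothetical stand-ins for the real driver types, and keeping Vega20 at 1 (as in the v3 patch quoted below) is an assumption, not something decided in this thread.

```c
/* Illustrative stand-in for the driver's ASIC enum (hypothetical, not the real amdgpu type). */
enum chip_type { CHIP_VEGA20, CHIP_RAVEN, CHIP_NAVI10, CHIP_NAVI14, CHIP_OTHER };

/*
 * module_param mirrors the amdgpu_noretry convention seen in the patch:
 * -1 means "use the per-ASIC default", any other value is an explicit override.
 * Returns the noretry value that would be applied.
 */
int pick_noretry(enum chip_type asic, int module_param)
{
	/* An explicit user setting always wins over the default. */
	if (module_param != -1)
		return module_param;

	switch (asic) {
	case CHIP_VEGA20:
		/* Assumption: keep the default of 1 introduced by the v3 patch. */
		return 1;
	case CHIP_NAVI10:
	case CHIP_NAVI14:
		/* Navi: leave retry enabled, i.e. noretry = 0. */
		return 0;
	case CHIP_RAVEN:
	default:
		return 0;
	}
}
```

With this sketch, pick_noretry(CHIP_NAVI10, -1) gives 0, while an explicit noretry=1 on the command line still takes effect.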

Regards,
  Felix

On 2020-11-30 at 12:22 p.m., Deucher, Alexander wrote:
>
> [AMD Public Use]
>
> We need to figure out what the root cause is then.  If we can't figure
> it out soon, we should revert the change for navi1x and continue to
> debug it until we can find the root cause and we can safely re-enable it.
>
> Alex
> ------------------------------------------------------------------------
> *From:* Chen, Guchun <guchun.c...@amd.com>
> *Sent:* Sunday, November 29, 2020 2:22 AM
> *To:* Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>; Kuehling, Felix
>   <felix.kuehl...@amd.com>
> *Cc:* Gui, Jack <jack....@amd.com>; Zhou1, Tao <tao.zh...@amd.com>;
>   amd-gfx mailing list <amd-gfx@lists.freedesktop.org>; Huang, Ray
>   <ray.hu...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>;
>   Zhang, Hawking <hawking.zh...@amd.com>
> *Subject:* RE: [PATCH v3] drm/amd/amdgpu: set the default value of
>   noretry to 1 for some dGPUs
>
> [AMD Public Use]
>
> Hi Bas Nieuwenhuizen,
>
> I don't think direct revert is one right approach, though it's able to
> fix your problem. noretry=0 will cause other test failure on several
> ASICs.
>
> Regards,
> Guchun
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Bas
>   Nieuwenhuizen
> Sent: Sunday, November 29, 2020 8:38 AM
> To: Kuehling, Felix <felix.kuehl...@amd.com>
> Cc: Gui, Jack <jack....@amd.com>; Chen, Guchun <guchun.c...@amd.com>;
>   Zhou1, Tao <tao.zh...@amd.com>; amd-gfx mailing list
>   <amd-gfx@lists.freedesktop.org>; Huang, Ray <ray.hu...@amd.com>;
>   Deucher, Alexander <alexander.deuc...@amd.com>; Zhang, Hawking
>   <hawking.zh...@amd.com>
> Subject: Re: [PATCH v3] drm/amd/amdgpu: set the default value of
>   noretry to 1 for some dGPUs
>
> Can we revert this patch to fix
> https://gitlab.freedesktop.org/drm/amd/-/issues/1374 ?
>
> On Thu, Oct 15, 2020 at 4:30 PM Felix Kuehling
> <felix.kuehl...@amd.com> wrote:
> >
> > On 2020-10-14 at 11:35 p.m., Chengming Gui wrote:
> > > noretry = 0 cause some dGPU's kfd page fault tests fail, so set
> > > noretry to 1 for these special ASICs:
> > > vega20/navi10/navi14/ARCTURUS
> > >
> > > v2: merge raven and default case due to the same setting
> > > v3: remove ARCTURUS
> > >
> > > Signed-off-by: Chengming Gui <jack....@amd.com>
> > > Change-Id: I3be70f463a49b0cd5c56456431d6c2cb98b13872
> >
> > Acked-by: Felix Kuehling <felix.kuehl...@amd.com>
> >
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 23 +++++++++++++++--------
> > >  1 file changed, 15 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > index 36604d751d62..f26eb4e54b12 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > > @@ -425,20 +425,27 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
> > >  	struct amdgpu_gmc *gmc = &adev->gmc;
> > >
> > >  	switch (adev->asic_type) {
> > > -	case CHIP_RAVEN:
> > > -		/* Raven currently has issues with noretry
> > > -		 * regardless of what we decide for other
> > > -		 * asics, we should leave raven with
> > > -		 * noretry = 0 until we root cause the
> > > -		 * issues.
> > > +	case CHIP_VEGA20:
> > > +	case CHIP_NAVI10:
> > > +	case CHIP_NAVI14:
> > > +		/*
> > > +		 * noretry = 0 will cause kfd page fault tests fail
> > > +		 * for some ASICs, so set default to 1 for these ASICs.
> > >  		 */
> > >  		if (amdgpu_noretry == -1)
> > > -			gmc->noretry = 0;
> > > +			gmc->noretry = 1;
> > >  		else
> > >  			gmc->noretry = amdgpu_noretry;
> > >  		break;
> > > +	case CHIP_RAVEN:
> > >  	default:
> > > -		/* default this to 0 for now, but we may want
> > > +		/* Raven currently has issues with noretry
> > > +		 * regardless of what we decide for other
> > > +		 * asics, we should leave raven with
> > > +		 * noretry = 0 until we root cause the
> > > +		 * issues.
> > > +		 *
> > > +		 * default this to 0 for now, but we may want
> > >  		 * to change this in the future for certain
> > >  		 * GPUs as it can increase performance in
> > >  		 * certain cases.
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx