Hi Prike,
no, that can lead to massive problems in a real OOM situation and is not
something we can do here.
Christian.
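To illustrate the concern, here is a minimal user-space sketch, not kernel code. It assumes a hypothetical stand-in (fake_validate) for ttm_bo_validate() that keeps returning -ENOMEM until memory is freed, or forever in a real OOM situation. The retry loop mirrors the control flow of the quoted patch, but adds an attempt cap (max_attempts, also hypothetical) purely so the demonstration terminates. The struct and function names are invented for this sketch.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for a buffer object. Validation fails with
 * -ENOMEM until more than `free_after` attempts have been made.
 * A negative `free_after` models a real OOM: memory never frees up.
 */
struct fake_bo {
    int free_after; /* failing attempts before success, < 0 = never */
    int calls;      /* validation attempts made so far */
};

/* Hypothetical replacement for ttm_bo_validate() in this sketch. */
static int fake_validate(struct fake_bo *bo)
{
    bo->calls++;
    if (bo->free_after >= 0 && bo->calls > bo->free_after)
        return 0;
    return -ENOMEM;
}

/*
 * Same retry-on-ENOMEM flow as the quoted patch, except that the
 * loop gives up after `max_attempts` tries. The quoted patch has no
 * such cap, so in the "never" case it would not return at all.
 */
static int pin_with_retry(struct fake_bo *bo, int max_attempts)
{
    int attempt = 0;
    int r;

    assert(max_attempts > 0);
retry:
    r = fake_validate(bo);
    if (r == -ENOMEM && ++attempt < max_attempts)
        goto retry;
    return r;
}
```

With a transient shortage (free_after = 2) the loop succeeds on the third attempt, which would match the Abaqus observation. With free_after = -1 every attempt fails, and only the cap stops the loop. This is one reading of the objection above, not a statement about how the driver should handle ENOMEM.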
On 15.05.19 at 04:00, Liang, Prike wrote:
Hi Christian,
I just wonder whether, when we encounter an ENOMEM error while pinning
amdgpu BOs, we can retry the validation as shown below.
With the following simple patch, the Abaqus pinning issue is no longer observed.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 11cbf63..72a32f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -902,11 +902,15 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 		bo->placements[i].lpfn = lpfn;
 		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
 	}
-
+retry:
 	r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
 	if (unlikely(r)) {
-		dev_err(adev->dev, "%p pin failed\n", bo);
-		goto error;
+		if (r == -ENOMEM){
+			goto retry;
+		} else {
+			dev_err(adev->dev, "%p pin failed\n", bo);
+			goto error;
+		}
 	}
 	bo->pin_count = 1;
Thanks,
Prike
*From:* Marek Olšák <mar...@gmail.com>
*Sent:* Wednesday, May 15, 2019 3:33 AM
*To:* Christian König <ckoenig.leichtzumer...@gmail.com>
*Cc:* Zhou, David(ChunMing) <david1.z...@amd.com>; Liang, Prike
<prike.li...@amd.com>; dri-devel <dri-devel@lists.freedesktop.org>;
amd-gfx mailing list <amd-...@lists.freedesktop.org>
*Subject:* Re: [PATCH 11/11] drm/amdgpu: stop removing BOs from the
LRU during CS
This series fixes the OOM errors. However, if I torture the kernel
driver more, I can get it to deadlock and end up with unkillable
processes. I can also get an OOM error. I just ran the test 5 times:
AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears &
AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears &
AMD_DEBUG=testgdsmm glxgears
Marek
On Tue, May 14, 2019 at 8:31 AM Christian König
<ckoenig.leichtzumer...@gmail.com> wrote:
This avoids OOM situations when we have lots of threads
submitting at the same time.
Signed-off-by: Christian König <christian.koe...@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fff558cf385b..f9240a94217b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	}
 	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates, true);
+				   &duplicates, false);
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
 			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1