On 11.12.2017 at 22:29, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>
Signed-off-by: Marek Olšák <marek.ol...@amd.com>
---
Is this really correct? I have no easy way to test it.
It's a step in the right direction, but I would rather vote for
something else:
Instead of disabling the timeout by default, we disable only the GPU
reset/recovery.
The idea is to add a new parameter, amdgpu_gpu_recovery, which makes
amdgpu_gpu_recover only print an error and not touch the GPU at
all (on bare metal systems).
Then we can finally set amdgpu_lockup_timeout to a non-zero value by
default.
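To illustrate the proposal above, here is a minimal user-space sketch in C, not driver code. It assumes that amdgpu_gpu_recovery is a plain on/off integer and that the report-only path applies only on bare metal. The struct sketch_device, its is_sriov_vf flag, the function sketch_gpu_recover() and the 10000 ms timeout are all hypothetical names and values invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for module parameters; the default values here are assumptions. */
static int amdgpu_gpu_recovery = 0;       /* proposed parameter: 0 = report only */
static int amdgpu_lockup_timeout = 10000; /* illustrative non-zero timeout in ms */

/* Hypothetical device model, not the real struct amdgpu_device. */
struct sketch_device {
	bool is_sriov_vf;   /* false models a bare metal system */
	int reset_count;    /* number of simulated resets */
};

/* Models what would run once a job exceeds amdgpu_lockup_timeout. */
static int sketch_gpu_recover(struct sketch_device *dev)
{
	assert(dev != NULL);

	/* Recovery disabled on bare metal: report the hang, leave the GPU alone. */
	if (!amdgpu_gpu_recovery && !dev->is_sriov_vf) {
		fprintf(stderr, "GPU hang detected after %d ms, recovery disabled\n",
			amdgpu_lockup_timeout);
		return 0;
	}

	/* Otherwise the real reset sequence would run; here we only count it. */
	dev->reset_count++;
	return 0;
}
```

In this sketch, leaving amdgpu_gpu_recovery at 0 means a timed-out job on bare metal only produces the log line, while setting it to 1 lets the (simulated) reset path run.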
Andrey, could you take care of this when you have time?
Thanks,
Christian.
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8d03baa..56c41cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3018,20 +3018,24 @@ static int amdgpu_reset_sriov(struct amdgpu_device *adev, uint64_t *reset_flags,
  *
  * Attempt to reset the GPU if it has hung (all asics).
  * Returns 0 for success or an error on failure.
  */
 int amdgpu_gpu_recover(struct amdgpu_device *adev, struct amdgpu_job *job)
 {
 	struct drm_atomic_state *state = NULL;
 	uint64_t reset_flags = 0;
 	int i, r, resched;
+	/* amdgpu.lockup_timeout=0 disables GPU reset. */
+	if (amdgpu_lockup_timeout == 0)
+		return 0;
+
 	if (!amdgpu_check_soft_reset(adev)) {
 		DRM_INFO("No hardware hang detected. Did some blocks stall?\n");
 		return 0;
 	}
 	dev_info(adev->dev, "GPU reset begin!\n");
 	mutex_lock(&adev->lock_reset);
 	atomic_inc(&adev->gpu_reset_counter);
 	adev->in_gpu_reset = 1;
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx