On 11.12.2017 at 22:29, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>

Signed-off-by: Marek Olšák <marek.ol...@amd.com>
---

Is this really correct? I have no easy way to test it.

It's a step in the right direction, but I would rather vote for something else:

Instead of disabling the timeout by default, we only disable the GPU reset/recovery.

The idea is to add a new parameter amdgpu_gpu_recovery which makes amdgpu_gpu_recover only print an error and not touch the GPU at all (on bare metal systems).

Then we can finally set amdgpu_lockup_timeout to a nonzero value by default.
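
For illustration, the proposal above could be modelled in user space roughly as follows. This is only a sketch under assumptions, not the driver code: gpu_recover_model(), job_timed_out(), do_gpu_reset() and resets_performed are hypothetical names, the 10000 ms default is an arbitrary placeholder, and treating SR-IOV as the case that still recovers is just one reading of "on bare metal systems".

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the module parameters named in the mail. */
static int amdgpu_lockup_timeout = 10000; /* ms; nonzero keeps hang detection on (placeholder value) */
static int amdgpu_gpu_recovery = 0;       /* 0: only report the hang, 1: really reset */

/* Counts how often the (stubbed) reset path ran, for demonstration only. */
static int resets_performed;

/* Stub for the real reset sequence; assumed to return 0 on success. */
static int do_gpu_reset(void)
{
	resets_performed++;
	return 0;
}

/* A job counts as hung once it runs past the timeout; 0 disables detection. */
static bool job_timed_out(int elapsed_ms)
{
	return amdgpu_lockup_timeout != 0 && elapsed_ms >= amdgpu_lockup_timeout;
}

/*
 * Model of the proposed behaviour: with recovery disabled on bare metal,
 * print an error and leave the GPU untouched; otherwise run the reset.
 */
static int gpu_recover_model(bool is_sriov_vf)
{
	if (!amdgpu_gpu_recovery && !is_sriov_vf) {
		fprintf(stderr, "GPU hang detected, recovery is disabled\n");
		return 0;
	}

	return do_gpu_reset();
}
```

With this split the timeout can stay enabled, so hangs are still detected and reported, while the decision to actually reset the hardware is controlled separately.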

Andrey, could you take care of this when you have time?

Thanks,
Christian.


  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8d03baa..56c41cf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3018,20 +3018,24 @@ static int amdgpu_reset_sriov(struct amdgpu_device *adev, uint64_t *reset_flags,
   *
   * Attempt to reset the GPU if it has hung (all asics).
   * Returns 0 for success or an error on failure.
   */
  int amdgpu_gpu_recover(struct amdgpu_device *adev, struct amdgpu_job *job)
  {
        struct drm_atomic_state *state = NULL;
        uint64_t reset_flags = 0;
        int i, r, resched;
+       /* amdgpu.lockup_timeout=0 disables GPU reset. */
+       if (amdgpu_lockup_timeout == 0)
+               return 0;
+
        if (!amdgpu_check_soft_reset(adev)) {
                DRM_INFO("No hardware hang detected. Did some blocks stall?\n");
                return 0;
        }
        dev_info(adev->dev, "GPU reset begin!\n");
        mutex_lock(&adev->lock_reset);
        atomic_inc(&adev->gpu_reset_counter);
        adev->in_gpu_reset = 1;
