Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Pan, Xinhui
Sent: Friday, April 17, 2020 5:17:22 PM To: Chen, Guchun ; amd-gfx@lists.freedesktop.org ; Zhang, Hawking ; Li, Dennis ; Clements, John Subject: Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU Am 16.04.20 um 17:47 schrieb Guchun Chen: > When running ras uncorrectable er

Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Pan, Xinhui
Sent: Friday, April 17, 2020 5:17:22 PM To: Chen, Guchun ; amd-gfx@lists.freedesktop.org ; Zhang, Hawking ; Li, Dennis ; Clements, John Subject: Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU Am 16.04.20 um 17:47 schrieb Guchun Chen: > When running

Re: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-17 Thread Christian König
Am 16.04.20 um 17:47 schrieb Guchun Chen: When running ras uncorrectable error injection and trigger GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: c0f4f750 [ 80.047300] #PF:

RE: [PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-16 Thread Clements, John
: fix kernel page fault issue by ras recovery on sGPU When running ras uncorrectable error injection and trigger GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: c0f

[PATCH] drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU

2020-04-16 Thread Guchun Chen
When running ras uncorrectable error injection and trigger GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: c0f4f750 [ 80.047300] #PF: supervisor write access in kernel mode [