RE: [PATCH] drm/scheduler: Fix bad job be re-processed in TDR

2018-11-12 Thread Huang, Trigger
Hi Christian, thanks for the earlier correction. My explanation is below; could you check it again? Thanks in advance. The key point is that if a job's fence is already signaled, calling dma_fence_set_error() on that fence triggers a kernel warning call trace: static inline void dm

RE: [PATCH] drm/scheduler: Fix bad job be re-processed in TDR

2018-11-12 Thread Liu, Monk
Hi Christian, Trigger's patch intends to fix the double signaling of a dma_fence in the following scenario: 1) Gpu_recovery(job = some job) is invoked by the guest TDR ==> the hung job's fence is set to -ECANCELED and fake-signaled. 2) GPU_recovery(job = NULL) is invoked again by the hypervisor or KFD, but the job of a
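The double-signaling hazard described in the two messages above can be modeled in a few lines of user-space C. This is only a toy sketch: `struct toy_fence`, `toy_fence_signal()` and `toy_fence_set_error_once()` are hypothetical names invented for illustration, not the kernel's dma_fence API, and the guard shown is not necessarily what the patch under discussion does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a fence: just a signaled flag and an error code. */
struct toy_fence {
    bool signaled;
    int error;
};

/* Mark the fence as signaled (models the fake signal in the first recovery). */
static void toy_fence_signal(struct toy_fence *f)
{
    f->signaled = true;
}

/*
 * Record an error only while the fence is still pending.
 * Returns true if the error was stored, false if the fence had already
 * signaled, which is the case a second recovery pass would run into.
 */
static bool toy_fence_set_error_once(struct toy_fence *f, int error)
{
    if (f->signaled)
        return false;
    f->error = error;
    return true;
}
```

In this model, a first recovery pass sets the error and signals the fence; a second pass sees `signaled == true` and leaves the fence untouched instead of setting the error again.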

[PATCH] drm/amdgpu: refactor smu8_send_msg_to_smc and WARN_ON time out

2018-11-12 Thread S, Shirish
From: Daniel Kurtz This patch refactors smu8_send_msg_to_smc_with_parameter() to include smu8_send_msg_to_smc_async() so that all the messages sent to SMU can be profiled and appropriately reported if they fail. Signed-off-by: Daniel Kurtz Signed-off-by: Shirish S --- drivers/gpu/drm/amd/powe

Re: [PATCH] drm/scheduler: Fix bad job be re-processed in TDR

2018-11-12 Thread Koenig, Christian
Hi guys, yeah that is the same problem I was trying to fix with drm/sched: fix timeout handling v2. But this case is even more complicated because here it is the hypervisor or KFD which triggers the second reset. Going to take another look at this today, Christian. Am 12.11.18 um 10:19 schrie

[PATCH] drm/amdgpu: Fix CSA buffer alloc failed on Vega

2018-11-12 Thread Rex Zhu
Alloc_pte failed when the VA address was located in the higher range of 256T, so reserve the CSA buffer under 128T as a workaround. [ 122.979425] amdgpu :03:00.0: va above limit (0xFFF1F >= 0x10) [ 122.987080] BUG: unable to handle kernel paging request at 880e1a79fff8

Re: [PATCH] drm/amdgpu: Fix CSA buffer alloc failed on Vega

2018-11-12 Thread Christian König
On 12.11.18 at 11:33, Rex Zhu wrote: Alloc_pte failed when the VA address was located in the higher range of 256T, so reserve the CSA buffer under 128T as a workaround. [ 122.979425] amdgpu :03:00.0: va above limit (0xFFF1F >= 0x10) [ 122.987080] BUG: unable to handle ker

Re: [PATCH] drm/amdgpu: Fix CSA buffer alloc failed on Vega

2018-11-12 Thread Zhu, Rex
Got it. Thanks. Best Regards Rex From: Christian König Sent: Monday, November 12, 2018 6:42 PM To: Zhu, Rex; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Fix CSA buffer alloc failed on Vega On 12.11.18 at 11:33, Rex Zhu wrote: > Alloc_pte

RE: [PATCH] drm/amdgpu: log smu version

2018-11-12 Thread Zhu, Rex
Patch is Reviewed-by: Rex Zhu Best Regards Rex > -Original Message- > From: amd-gfx On Behalf Of S, > Shirish > Sent: Monday, November 12, 2018 2:50 PM > To: Zhu, Rex ; Deucher, Alexander > > Cc: amd-gfx@lists.freedesktop.org; S, Shirish > Subject: [PATCH] drm/amdgpu: log smu versio

[PATCH] drm/amdgpu: refactor smu8_send_msg_to_smc and WARN_ON time out

2018-11-12 Thread S, Shirish
From: Daniel Kurtz This patch refactors smu8_send_msg_to_smc_with_parameter() to include smu8_send_msg_to_smc_async() so that all the messages sent to SMU can be profiled and appropriately reported if they fail. Signed-off-by: Daniel Kurtz Signed-off-by: Shirish S --- drivers/gpu/drm/amd/powe

RE: [PATCH] drm/amdgpu: refactor smu8_send_msg_to_smc and WARN_ON time out

2018-11-12 Thread Zhu, Rex
Patch is Reviewed-by: Rex Zhu Best Regards Rex > -Original Message- > From: amd-gfx On Behalf Of S, > Shirish > Sent: Monday, November 12, 2018 10:48 PM > To: Zhu, Rex ; Deucher, Alexander > > Cc: Daniel Kurtz ; amd-gfx@lists.freedesktop.org; S, > Shirish > Subject: [PATCH] drm/amdgp

[PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Maarten Lankhorst
We already have __drm_atomic_helper_connector_reset() and __drm_atomic_helper_plane_reset(), extend this to crtc as well. Most drivers already have a gpu reset hook, correct it. Nouveau already implemented its own __drm_atomic_helper_crtc_reset(), convert it to the common one. Signed-off-by: Maar

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Boris Brezillon
On Mon, 12 Nov 2018 16:01:14 +0100 Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_atomic

Re: [PATCH 6/8] drm/amdgpu: meld together VM fragment and huge page handling

2018-11-12 Thread Christian König
On 09.11.18 at 13:13, Koenig, Christian wrote: Adding Aaron who initially reported the problem with this patch and suspend/resume. On 08.11.18 at 20:35, Samuel Pitoiset wrote: On 11/8/18 5:50 PM, Christian König wrote: Hi Samuel, yeah, that is a known issue and I was already looking into th

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Li, Sun peng (Leo)
On 2018-11-12 10:01 AM, Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_atomic_helper_crt

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Thierry Reding
On Mon, Nov 12, 2018 at 04:01:14PM +0100, Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_a

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Heiko Stuebner
On Monday, 12 November 2018, 16:01:14 CET, Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __dr

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Wentland, Harry
On 2018-11-12 10:01 a.m., Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_atomic_helper_c

Re: [PATCH v2 2/2] drm/amd/display: Support amdgpu "max bpc" connector property

2018-11-12 Thread Wentland, Harry
On 2018-11-07 12:50 p.m., Nicholas Kazlauskas wrote: > [Why] > Many panels support more than 8bpc but some modes are unavailable while > running at greater than 8bpc due to DP/HDMI bandwidth constraints. > > Support for more than 8bpc was added recently in the driver but it > defaults to the maxim

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Sean Paul
On Mon, Nov 12, 2018 at 04:01:14PM +0100, Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_a

Re: [PATCH v7 3/5] drm: Document variable refresh properties

2018-11-12 Thread Wentland, Harry
On 2018-11-08 9:43 a.m., Nicholas Kazlauskas wrote: > These include the drm_connector 'vrr_capable' and the drm_crtc > 'vrr_enabled' properties. > > Signed-off-by: Nicholas Kazlauskas > Cc: Harry Wentland > Cc: Manasi Navare > Cc: Pekka Paalanen > Cc: Ville Syrjälä > Cc: Michel Dänzer Looks

[PATCH] drm/amdgpu: set system aperture to cover whole FB region

2018-11-12 Thread Liu, Shaoyun
In an XGMI configuration, the FB region covers the VRAM regions of peer devices; adjust the system aperture to cover all of them. Change-Id: I15b14727bbd11f7c2a237fb6ca7b32fb708b2e32 Signed-off-by: shaoyunl --- drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 6 +++--- drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c | 6

Re: [PATCH v7 3/5] drm: Document variable refresh properties

2018-11-12 Thread Kazlauskas, Nicholas
On 11/12/18 11:12 AM, Wentland, Harry wrote: > On 2018-11-08 9:43 a.m., Nicholas Kazlauskas wrote: >> These include the drm_connector 'vrr_capable' and the drm_crtc >> 'vrr_enabled' properties. >> >> Signed-off-by: Nicholas Kazlauskas >> Cc: Harry Wentland >> Cc: Manasi Navare >> Cc: Pekka Paala

Re: [PATCH] drm/amdgpu: set system aperture to cover whole FB region

2018-11-12 Thread Alex Deucher
On Mon, Nov 12, 2018 at 12:02 PM Liu, Shaoyun wrote: > > In XGMI configuration, the FB region covers vram region from peer > device, adjust system aperture to cover all of them > > Change-Id: I15b14727bbd11f7c2a237fb6ca7b32fb708b2e32 > Signed-off-by: shaoyunl Reviewed-by: Alex Deucher > --- >

Re: [PATCH] drm/amdgpu: set system aperture to cover whole FB region

2018-11-12 Thread Christian König
On 12.11.18 at 18:02, Liu, Shaoyun wrote: In an XGMI configuration, the FB region covers the VRAM regions of peer devices; adjust the system aperture to cover all of them. Change-Id: I15b14727bbd11f7c2a237fb6ca7b32fb708b2e32 Signed-off-by: shaoyunl Reviewed-by: Christian König --- drivers/gpu/drm/a

[PATCH] drm/amdgpu: fix huge page handling on Vega10

2018-11-12 Thread Christian König
We accidentally set the huge flag on the parent instead of the children. This caused some VM faults under memory pressure. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/

Re: [PATCH] drm/amdgpu: fix huge page handling on Vega10

2018-11-12 Thread Alex Deucher
On Mon, Nov 12, 2018 at 12:09 PM Christian König wrote: > > We accidentally set the huge flag on the parent instead of the children. > This caused some VM faults under memory pressure. > > Signed-off-by: Christian König Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 18

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Maarten Lankhorst
On 12-11-18 at 17:11, Sean Paul wrote: > On Mon, Nov 12, 2018 at 04:01:14PM +0100, Maarten Lankhorst wrote: >> We already have __drm_atomic_helper_connector_reset() and >> __drm_atomic_helper_plane_reset(), extend this to crtc as well. >> >> Most drivers already have a gpu reset hook, correct it.

Re: [Intel-gfx] [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Ville Syrjälä
On Mon, Nov 12, 2018 at 04:01:14PM +0100, Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_a

[PATCH] drm/amdgpu: fix bug with IH ring setup

2018-11-12 Thread Yang, Philip
The bug limits the IH ring wptr address to 40 bits. When system memory is larger than 1 TB, the bus address exceeds 40 bits, so interrupts cannot be handled and cleared correctly. Change-Id: I3cd1b8ad046b38945372f2fd1a2d225624893e28 Signed-off-by: Philip Yang --- drivers/gpu/drm/
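The arithmetic behind this report can be sketched as follows: 1 TB (taken here as 2^40 bytes) is exactly the first address that no longer fits in 40 bits, so masking a larger bus address to 40 bits silently points it somewhere else. The helper names below are hypothetical and only illustrate the truncation; they are not taken from the driver.

```c
#include <stdint.h>

/* Mask covering the low 40 bits of an address. */
#define ADDR_MASK_40BIT ((UINT64_C(1) << 40) - 1)

/* Returns nonzero if the bus address can be represented in 40 bits. */
static int fits_in_40_bits(uint64_t bus_addr)
{
    return (bus_addr & ~ADDR_MASK_40BIT) == 0;
}

/* Models a 40-bit limit: any bits above bit 39 are dropped. */
static uint64_t clamp_to_40_bits(uint64_t bus_addr)
{
    return bus_addr & ADDR_MASK_40BIT;
}
```

For an address such as 2^40 + 0x1000, `fits_in_40_bits()` reports that it does not fit and `clamp_to_40_bits()` yields 0x1000, a different location from the one intended.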

Re: [PATCH] drm/amdgpu: fix bug with IH ring setup

2018-11-12 Thread Deucher, Alexander
Reviewed-by: Alex Deucher From: amd-gfx on behalf of Yang, Philip Sent: Monday, November 12, 2018 2:20 PM To: amd-gfx@lists.freedesktop.org Cc: Yang, Philip Subject: [PATCH] drm/amdgpu: fix bug with IH ring setup The bug limits the IH ring wptr address to 40b

Re: [PATCH 2/2] drm/atomic: Create and use __drm_atomic_helper_crtc_reset() everywhere

2018-11-12 Thread Liviu Dudau
On Mon, Nov 12, 2018 at 04:01:14PM +0100, Maarten Lankhorst wrote: > We already have __drm_atomic_helper_connector_reset() and > __drm_atomic_helper_plane_reset(), extend this to crtc as well. > > Most drivers already have a gpu reset hook, correct it. > Nouveau already implemented its own __drm_a

Re: [PATCH] drm/amdgpu: fix huge page handling on Vega10

2018-11-12 Thread Samuel Pitoiset
On 11/12/18 6:16 PM, Alex Deucher wrote: On Mon, Nov 12, 2018 at 12:09 PM Christian König wrote: We accidentally set the huge flag on the parent instead of the children. This caused some VM faults under memory pressure. Signed-off-by: Christian König Yes, this fixes GPU hangs with F12017

Re: [PATCH] drm/amdgpu: fix huge page handling on Vega10

2018-11-12 Thread Kuehling, Felix
On 2018-11-12 12:09 p.m., Christian König wrote: > We accidentally set the huge flag on the parent instead of the children. > This caused some VM faults under memory pressure. Reviewed-by: Felix Kuehling I got a bit confused when re-reading this code. Maybe part of it is that cursor.entry is not

[PATCH] drm/amd/pp: Fix truncated clock value when set watermark

2018-11-12 Thread Rex Zhu
The clk value should be converted to MHz first and then cast to uint16; otherwise, the clock value will be truncated. Reported-by: Hersen Wu Signed-off-by: Rex Zhu --- drivers/gpu/drm/amd/powerplay/hwmgr/smu_helper.c | 32 1 file changed, 16 insertions(+), 16 delet
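The order-of-operations problem in this commit message can be shown with a minimal sketch. It assumes the input clock is in kHz, which the snippet does not state; the function names are hypothetical and not taken from smu_helper.c.

```c
#include <stdint.h>

/* Assumption: input clocks are in kHz, so 1000 of them make one MHz. */
#define KHZ_PER_MHZ 1000u

/* Narrow first, then convert: the cast drops the high bits before dividing. */
static uint16_t mhz_narrow_then_convert(uint32_t clk_khz)
{
    uint16_t narrowed = (uint16_t)clk_khz; /* keeps only the low 16 bits */
    return (uint16_t)(narrowed / KHZ_PER_MHZ);
}

/* Convert first, then narrow: the MHz value is small enough for 16 bits. */
static uint16_t mhz_convert_then_narrow(uint32_t clk_khz)
{
    return (uint16_t)(clk_khz / KHZ_PER_MHZ);
}
```

With a 300000 kHz clock, narrowing first leaves 37856 (300000 modulo 65536) and yields 37, while converting first yields the expected 300.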

Re: [PATCH] drm/amd/pp: Fix truncated clock value when set watermark

2018-11-12 Thread Alex Deucher
On Mon, Nov 12, 2018 at 10:34 PM Rex Zhu wrote: > > The clk value should be converted to MHz first and > then cast to uint16; otherwise, the clock value > will be truncated. > > Reported-by: Hersen Wu > Signed-off-by: Rex Zhu Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/powerpl

Re: [PATCH] drm/amdgpu: fix huge page handling on Vega10

2018-11-12 Thread Christian König
On 13.11.18 at 00:50, Kuehling, Felix wrote: On 2018-11-12 12:09 p.m., Christian König wrote: We accidentally set the huge flag on the parent instead of the children. This caused some VM faults under memory pressure. Reviewed-by: Felix Kuehling I got a bit confused when re-reading this code.