[PATCH 03/10] drm/amdgpu: clear ras controller status registers when interrupt occurs

2019-11-27 Thread Le Ma
Fix the issue that the ras controller interrupt cannot be triggered again after a single nbif uncorrectable error. The error count is stored in the nbif ras object for query. Change-Id: Iba482c169fdff3e9c390072c0289a622a522133c Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 10 ++

[PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Le Ma
Avoid changing the default reset behavior for production cards by checking that amdgpu_ras_enable equals 2. Also, only new enough smu ucode can support baco for the xgmi/ras case. Change-Id: I07c3e6862be03e068745c73db8ea71f428ecba6b Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/soc15.c | 4 +++- 1 file

[PATCH 02/10] drm/amdgpu: export amdgpu_ras_find_obj to use externally

2019-11-27 Thread Le Ma
Change it to external interface. Change-Id: I2ab61f149c84a05a6f883a4c7415ea8012ec03a6 Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 +++ 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/am

[PATCH 09/10] drm/amdgpu: clear err_event_athub flag after reset exit

2019-11-27 Thread Le Ma
Otherwise next err_event_athub error cannot call gpu reset. And following resume sequence will not be affected by this flag. v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver Change-Id: I5cd293f30f23876bf2a1860681bcb50f47713ecd Signed-off-by: Le Ma --- drivers/gpu/drm

[PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper

2019-11-27 Thread Le Ma
This operation is needed when baco entry/exit for ras recovery Change-Id: I535c7231693f3138a8e3d5acd55672e2ac68232f Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amd

[PATCH 01/10] drm/amdgpu: remove ras global recovery handling from ras_controller_int handler

2019-11-27 Thread Le Ma
From: Le Ma v2: add notification when ras controller interrupt generates Change-Id: Ic03e42e9d1c4dab1fa7f4817c191a16e485b48a9 Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nbi

[PATCH 10/10] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Le Ma
Move the print out of uvd instance loop in amdgpu_uvd_suspend Change-Id: Ifad997debd84763e1b55d668e144b729598f115e Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/d

[PATCH 07/10] drm/amdgpu: add concurrent baco reset support for XGMI

2019-11-27 Thread Le Ma
Currently each XGMI node reset wq does not run in parallel because the same work item bound to the same cpu runs in sequence. So change to bind the xgmi_reset_work item to different cpus. XGMI requires all nodes to enter baco within very close proximity before any node exits baco. So schedule the xgmi_

[PATCH 08/10] drm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs

2019-11-27 Thread Le Ma
This athub fatal error can be recovered by baco without a system-level reboot, so add a mode to use baco for the recovery. This does not affect the default psp reset situations for now. Change-Id: Ib17f2a39254ff6b0473a785752adfdfea79d0e0d Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |

[PATCH v3 2/2] drm: share address space for dma bufs

2019-11-27 Thread Gerd Hoffmann
Use the shared address space of the drm device (see drm_open() in drm_file.c) for dma-bufs too. That removes a difference between drm device mmap vmas and dma-buf mmap vmas and fixes corner cases like dropping ptes (using madvise(DONTNEED) for example) not working properly. Also remove amdgpu dri

RE: [PATCH 10/10] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Chen, Guchun
[AMD Official Use Only - Internal Distribution Only] -Original Message- From: Le Ma Sent: Wednesday, November 27, 2019 5:15 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Chen, Guchun ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma, Le Subject: [PATCH 10/10] drm/amdg

RE: [PATCH 10/10] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Ma, Le
-Original Message- From: Chen, Guchun Sent: Wednesday, November 27, 2019 5:50 PM To: Ma, Le ; amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma, Le Subject: RE: [PATCH 10/10] drm/amdgpu: reduce redundant uvd context lost warning me

[PATCH 10/10 v2] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Le Ma
Move the print out of uvd instance loop in amdgpu_uvd_suspend v2: drop unnecessary brackets Change-Id: Ifad997debd84763e1b55d668e144b729598f115e Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/driver

Re: [PATCH 10/10 v2] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Christian König
Am 27.11.19 um 11:02 schrieb Le Ma: Move the print out of uvd instance loop in amdgpu_uvd_suspend v2: drop unnecessary brackets Change-Id: Ifad997debd84763e1b55d668e144b729598f115e Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 10 ++ 1 file changed, 6 insertions

RE: [PATCH 10/10 v2] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Ma, Le
-Original Message- From: Christian König Sent: Wednesday, November 27, 2019 6:08 PM To: Ma, Le ; amd-gfx@lists.freedesktop.org Cc: Chen, Guchun ; Zhou1, Tao ; Deucher, Alexander ; Li, Dennis ; Zhang, Hawking Subject: Re: [PATCH 10/10 v2] drm/amdgpu: reduce redundant uvd context lost

[PATCH 10/10 v3] drm/amdgpu: reduce redundant uvd context lost warning message

2019-11-27 Thread Le Ma
Move the print out of uvd instance loop in amdgpu_uvd_suspend v2: drop unnecessary brackets v3: grab ras_intr state once for multiple times use Change-Id: Ifad997debd84763e1b55d668e144b729598f115e Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 11 +++ 1 file changed,
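The change described above (print the warning once, and read the ras_intr state once) can be sketched roughly as follows. This is a minimal userspace illustration, not the driver code: `ras_intr_triggered()`, `NUM_UVD_INST`, `warnings_printed`, and the message text are hypothetical stand-ins introduced here.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_UVD_INST 2	/* hypothetical instance count */

static int warnings_printed;	/* counts warnings, for illustration only */

/* Hypothetical stand-in for the driver's RAS interrupt state query. */
static bool ras_intr_triggered(void)
{
	return true;
}

static void uvd_suspend_sketch(void)
{
	/* Read the RAS interrupt state once and reuse it below. */
	bool in_ras_intr = ras_intr_triggered();
	int i;

	for (i = 0; i < NUM_UVD_INST; ++i) {
		/* Per-instance save work would go here; nothing is printed in the loop. */
	}

	/* Warn once after the loop rather than once per instance. */
	if (in_ras_intr) {
		fprintf(stderr, "UVD context may be lost due to RAS interrupt\n");
		warnings_printed++;
	}
}
```

With two instances, the warning is emitted a single time instead of twice.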

RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Zhang, Hawking
[AMD Public Use] After thinking it a bit, I think we can just rely on PMFW version to decide to go RAS recovery or legacy fatal_error handling for the platforms that support RAS. Leveraging amdgpu_ras_enable as a temporary solution seems not necessary? Even baco ras recovery not stable, it is t

RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Zhang, Hawking
[AMD Public Use] And It is still necessary to put all the condition check in a function. I mean a function that decide to go ras recovery or legacy fatal_error handling. The PMFW version that support RAS recovery will be different among ASICs. Current version check only works for VG20. In fact,

RE: [PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper

2019-11-27 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] Please check my comments inline Regards, Hawking -Original Message- From: Le Ma Sent: 2019年11月27日 17:15 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Chen, Guchun ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma, Le

RE: [PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper

2019-11-27 Thread Ma, Le
From: Zhang, Hawking Sent: Wednesday, November 27, 2019 8:04 PM To: Ma, Le ; amd-gfx@lists.freedesktop.org Cc: Chen, Guchun ; Zhou1, Tao ; Li, Dennis ; Deucher, Alexander ; Ma, Le Subject: RE: [PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper Please chec

RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Ma, Le
Agree with your thoughts that we drop amdgpu_ras_enable=2 condition. The only concern in my side is that besides fatal_error, another result may happen that atombios_init timeout on xgmi by baco (not sure psp mode1 reset causes this as well). Assuming no amdgpu_ras_enable=2 check, if PMFW > 4

[PATCH 06/10 v2] drm/amdgpu: add condition to enable baco for ras recovery

2019-11-27 Thread Le Ma
Switch to baco reset method for ras recovery if baco-supported PMFW ready. If not, keep the original reset method. Change-Id: I07c3e6862be03e068745c73db8ea71f428ecba6b Signed-off-by: Le Ma --- drivers/gpu/drm/amd/amdgpu/soc15.c | 18 -- 1 file changed, 8 insertions(+), 10 deletio

RE: [PATCH 06/10] drm/amdgpu: add condition to enable baco for xgmi/ras case

2019-11-27 Thread Ma, Le
Hi Hawking, Please check this v2 patch which is just sent out. And as discussed, we decide to still leverage the current reset_method() function with functionality/change scale/code maintainability balanced . Thanks. Regards, Ma Le -Original Message- From: Zhang, Hawking Sent: Wednes

Deadlock on PTEs update for HMM

2019-11-27 Thread Sierra Guiza, Alejandro (Alex)
Hi Christian, As you know, we're working on the HMM enablement. I'm working on the dGPU page table entries invalidation on the userptr mapping case. Currently, the MMU notifier handler stops all user mode queues and schedules a delayed worker to re-validate userptr mappings and restart the queues. Pa

Re: Deadlock on PTEs update for HMM

2019-11-27 Thread Christian König
Hi Alejandro, yes I'm very aware of this issue, but unfortunately can't give an easy solution either. I'm working for over a year now on getting this fixed, but unfortunately it turned out that this problem is much bigger than initially thought. Setting the appropriate GFP flags for the job

[PATCH 07/10 v2] drm/amdgpu: add concurrent baco reset support for XGMI

2019-11-27 Thread Le Ma
Currently each XGMI node reset wq does not run in parallel because the same work item bound to the same cpu runs in sequence. So change to bind the xgmi_reset_work item to different cpus. XGMI requires all nodes to enter baco within very close proximity before any node exits baco. So schedule the xgmi_

Re: [PATCH v3 1/2] drm/sched: Avoid job cleanup if sched thread is parked.

2019-11-27 Thread Andrey Grodzovsky
Ping... Andrey On 11/26/19 10:36 AM, Andrey Grodzovsky wrote: On 11/26/19 4:08 AM, Christian König wrote: Am 25.11.19 um 17:51 schrieb Steven Price: On 25/11/2019 14:10, Andrey Grodzovsky wrote: When the sched thread is parked we assume ring_mirror_list is not accessed from here. FWIW I don

Re: [PATCH v3 1/2] drm/sched: Avoid job cleanup if sched thread is parked.

2019-11-27 Thread Christian König
Am 27.11.19 um 16:32 schrieb Andrey Grodzovsky: Ping... Andrey On 11/26/19 10:36 AM, Andrey Grodzovsky wrote: On 11/26/19 4:08 AM, Christian König wrote: Am 25.11.19 um 17:51 schrieb Steven Price: On 25/11/2019 14:10, Andrey Grodzovsky wrote: When the sched thread is parked we assume ring_

Re: [PATCH 07/10] drm/amdgpu: add concurrent baco reset support for XGMI

2019-11-27 Thread Andrey Grodzovsky
On 11/27/19 4:15 AM, Le Ma wrote: Currently each XGMI node reset wq does not run in parallel because same work item bound to same cpu runs in sequence. So change to bound the xgmi_reset_work item to different cpus. It's not the same work item, see more below XGMI requires all nodes enter

[PATCH v2] drm: drop DRM_AUTH from PRIME_TO/FROM_HANDLE ioctls

2019-11-27 Thread Emil Velikov
From: Emil Velikov Current validation requires that we're authenticated, even though we can bypass (by design) the authentication when using a render node. Let's address the former by following the design decision. v2: Add simpler validation in the ioctls themselves (Boris) Cc: Alex Deucher C

Re: [PATCH 5/5] drm: drop DRM_AUTH from PRIME_TO/FROM_HANDLE ioctls

2019-11-27 Thread Emil Velikov
On Wed, 27 Nov 2019 at 07:41, Boris Brezillon wrote: > > Hi Emil, > > On Fri, 1 Nov 2019 13:03:13 + > Emil Velikov wrote: > > > From: Emil Velikov > > > > As mentioned by Christian, for drivers which support only primary nodes > > this changes the returned error from -EACCES into -EOPNOTSUP

Re: [PATCH] drm/amd/display: Get NV14 specific ip params as needed

2019-11-27 Thread Kazlauskas, Nicholas
On 2019-11-26 4:32 p.m., Zhan Liu wrote: [Why] NV14 is using its own ip params that's different from other DCN2.0 ASICs. [How] Add ASIC revision check to make sure NV14 gets correct ip params. Signed-off-by: Zhan Liu Reviewed-by: Nicholas Kazlauskas --- drivers/gpu/drm/amd/display/dc/dc

Re: [PATCH][next] drm/amd/display: fix double assignment to msg_id field

2019-11-27 Thread Harry Wentland
On 2019-11-20 12:22 p.m., Colin King wrote: > From: Colin Ian King > > The msg_id field is being assigned twice. Fix this by replacing the second > assignment with an assignment to msg_size. > > Addresses-Coverity: ("Unused value") > Fixes: 11a00965d261 ("drm/amd/display: Add PSP block to verify

[PATCH 4/5] drm/amd/powerplay: Remove unneeded variable 'result' in vega12_hwmgr.c

2019-11-27 Thread zhengbin
Fixes coccicheck warning: drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c:502:5-11: Unneeded variable: "result". Return "0" on line 515 Reported-by: Hulk Robot Signed-off-by: zhengbin --- drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletio

[PATCH 5/5] drm/amd/powerplay: Remove unneeded variable 'ret' in amdgpu_smu.c

2019-11-27 Thread zhengbin
Fixes coccicheck warning: drivers/gpu/drm/amd/powerplay/amdgpu_smu.c:1192:5-8: Unneeded variable: "ret". Return "0" on line 1195 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c:1945:5-8: Unneeded variable: "ret". Return "0" on line 1961 Reported-by: Hulk Robot Signed-off-by: zhengbin --- drivers/

[PATCH 1/5] drm/amd/powerplay: Remove unneeded variable 'result' in smu10_hwmgr.c

2019-11-27 Thread zhengbin
Fixes coccicheck warning: drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c:1154:5-11: Unneeded variable: "result". Return "0" on line 1159 Reported-by: Hulk Robot Signed-off-by: zhengbin --- drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletion

[PATCH 3/5] drm/amd/powerplay: Remove unneeded variable 'ret' in smu7_hwmgr.c

2019-11-27 Thread zhengbin
Fixes coccicheck warning: drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c:5188:5-8: Unneeded variable: "ret". Return "0" on line 5196 Reported-by: Hulk Robot Signed-off-by: zhengbin --- drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)

[PATCH 2/5] drm/amd/powerplay: Remove unneeded variable 'result' in vega10_hwmgr.c

2019-11-27 Thread zhengbin
Fixes coccicheck warning: drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c:4363:5-11: Unneeded variable: "result". Return "0" on line 4370 Reported-by: Hulk Robot Signed-off-by: zhengbin --- drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 3 +-- 1 file changed, 1 insertion(+), 2 deleti

[PATCH 0/5] drm/amd/powerplay: Remove unneeded variable

2019-11-27 Thread zhengbin
zhengbin (5): drm/amd/powerplay: Remove unneeded variable 'result' in smu10_hwmgr.c drm/amd/powerplay: Remove unneeded variable 'result' in vega10_hwmgr.c drm/amd/powerplay: Remove unneeded variable 'ret' in smu7_hwmgr.c drm/amd/powerplay: Remove unneeded variable 'result' in vega12_hwmgr.c
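For readers unfamiliar with the coccicheck "Unneeded variable" warning this series addresses, the pattern looks roughly like the sketch below. The function names and bodies are hypothetical and are not taken from the powerplay sources; only the shape of the cleanup is illustrated.

```c
/* Before: 'result' is initialized to 0, never changed, then returned. */
static int set_defaults_before(int *table)
{
	int result = 0;

	table[0] = 1;
	return result;
}

/* After: the variable is dropped and the constant is returned directly. */
static int set_defaults_after(int *table)
{
	table[0] = 1;
	return 0;
}
```

Both versions behave identically; the second simply removes a variable that carries no information.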

Re: [PATCH 5/5] drm: drop DRM_AUTH from PRIME_TO/FROM_HANDLE ioctls

2019-11-27 Thread Daniel Vetter
On Wed, Nov 27, 2019 at 04:27:29PM +, Emil Velikov wrote: > On Wed, 27 Nov 2019 at 07:41, Boris Brezillon > wrote: > > > > Hi Emil, > > > > On Fri, 1 Nov 2019 13:03:13 + > > Emil Velikov wrote: > > > > > From: Emil Velikov > > > > > > As mentioned by Christian, for drivers which support

Re: [PATCH 5/5] drm: drop DRM_AUTH from PRIME_TO/FROM_HANDLE ioctls

2019-11-27 Thread Emil Velikov
On Wed, 27 Nov 2019 at 18:04, Daniel Vetter wrote: > > On Wed, Nov 27, 2019 at 04:27:29PM +, Emil Velikov wrote: > > On Wed, 27 Nov 2019 at 07:41, Boris Brezillon > > wrote: > > > > > > Hi Emil, > > > > > > On Fri, 1 Nov 2019 13:03:13 + > > > Emil Velikov wrote: > > > > > > > From: Emil

Re: [PATCH 5/5] drm: drop DRM_AUTH from PRIME_TO/FROM_HANDLE ioctls

2019-11-27 Thread Daniel Vetter
On Wed, Nov 27, 2019 at 06:32:56PM +, Emil Velikov wrote: > On Wed, 27 Nov 2019 at 18:04, Daniel Vetter wrote: > > > > On Wed, Nov 27, 2019 at 04:27:29PM +, Emil Velikov wrote: > > > On Wed, 27 Nov 2019 at 07:41, Boris Brezillon > > > wrote: > > > > > > > > Hi Emil, > > > > > > > > On Fri

Re: [PATCH] drm/amdgpu: implement TMZ accessor (v2)

2019-11-27 Thread Alex Deucher
On Tue, Nov 26, 2019 at 9:03 PM Luben Tuikov wrote: > > Implement an accessor of adev->tmz.enabled. Let not > code around access it as "if (adev->tmz.enabled)" > as the organization may change. Instead... > > Recruit "bool amdgpu_is_tmz(adev)" to return > exactly this Boolean value. That is, this

[PATCH] drm/amdgpu: move CS secure flag next the structs where it's used

2019-11-27 Thread Alex Deucher
So it's not mixed up with the CTX stuff. Signed-off-by: Alex Deucher --- include/uapi/drm/amdgpu_drm.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index f75c6957064d..918ac3548cd3 100644 --- a/include/uap

RE: [PATCH] drm/amdgpu: move CS secure flag next the structs where it's used

2019-11-27 Thread Liu, Zhan
> -Original Message- > From: amd-gfx On Behalf Of Alex > Deucher > Sent: 2019/November/27, Wednesday 3:57 PM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: [PATCH] drm/amdgpu: move CS secure flag next the structs where it's > used > > So it's not mixed up with

Re: [PATCH] drm/amdgpu: implement TMZ accessor (v2)

2019-11-27 Thread Luben Tuikov
On 2019-11-27 3:37 p.m., Alex Deucher wrote: > On Tue, Nov 26, 2019 at 9:03 PM Luben Tuikov wrote: >> >> Implement an accessor of adev->tmz.enabled. Let not >> code around access it as "if (adev->tmz.enabled)" >> as the organization may change. Instead... >> >> Recruit "bool amdgpu_is_tmz(adev)" t

[PATCH] drm/amdgpu: implement TMZ accessor (v3)

2019-11-27 Thread Luben Tuikov
Implement an accessor of adev->tmz.enabled. Let not code around access it as "if (adev->tmz.enabled)" as the organization may change. Instead... Recruit "bool amdgpu_is_tmz(adev)" to return exactly this Boolean value. That is, this function is now an accessor of an already initialized and set adev
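The accessor idea can be sketched in a few lines. The structure definitions below are simplified stand-ins assumed for illustration (the real amdgpu device structure is far larger); only the names `amdgpu_is_tmz`, `adev`, and `tmz.enabled` come from the message above.

```c
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the driver structures. */
struct tmz_sketch {
	bool enabled;
};

struct device_sketch {
	struct tmz_sketch tmz;
};

/* Callers use the accessor instead of reading adev->tmz.enabled directly,
 * so the field can move later without touching every call site. */
static inline bool amdgpu_is_tmz(const struct device_sketch *adev)
{
	return adev->tmz.enabled;
}
```

A call site would then read `if (amdgpu_is_tmz(adev))` rather than `if (adev->tmz.enabled)`.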

RE: [PATCH 1/5] drm/amdgpu: fix GFX10 missing CSIB set

2019-11-27 Thread Liu, Monk
Ping _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Monk Liu Sent: Tuesday, November 26, 2019 7:50 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 1/5] drm/amdgpu: fix GFX10 missing CSIB set still need to init

RE: [PATCH 3/5] drm/amdgpu: do autoload right after MEC loaded for SRIOV VF

2019-11-27 Thread Liu, Monk
ping _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Monk Liu Sent: Tuesday, November 26, 2019 7:50 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 3/5] drm/amdgpu: do autoload right after MEC loaded for SRIOV VF

RE: [PATCH 2/5] drm/amdgpu: skip rlc ucode loading for SRIOV gfx10

2019-11-27 Thread Liu, Monk
_ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Monk Liu Sent: Tuesday, November 26, 2019 7:50 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 2/5] drm/amdgpu: skip rlc ucode loading for SRIOV gfx10 Signed-off-b

RE: [PATCH 4/5] drm/amdgpu: use CPU to flush vmhub if sched stopped

2019-11-27 Thread Liu, Monk
ping _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Monk Liu Sent: Tuesday, November 26, 2019 7:50 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 4/5] drm/amdgpu: use CPU to flush vmhub if sched stopped otherws

RE: [PATCH 5/5] drm/amdgpu: fix calltrace during kmd unload

2019-11-27 Thread Liu, Monk
ping _ Monk Liu|GPU Virtualization Team |AMD -Original Message- From: Monk Liu Sent: Tuesday, November 26, 2019 7:50 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH 5/5] drm/amdgpu: fix calltrace during kmd unload kernel would re

RE: [PATCH 4/5] drm/amdgpu: use CPU to flush vmhub if sched stopped

2019-11-27 Thread Liu, Monk
Christian >> Good catch, but you are somehow messing up the indentation here. I cannot align with the indentation, because my coding style check script (we use it to push code to gerritgit) requires me to use "tab" instead of "space" It means the current coding style is in fact wrong _

RE: [PATCH 5/5] drm/amdgpu: fix calltrace during kmd unload

2019-11-27 Thread Liu, Monk
Hi Xiaojie For SRIOV we don't use suspend so I didn't think to that part, thanks for the remind ! But we still need to fix this call trace issue anyway (our jenkins testing system consider such call trace as an error ) How about we do " adev->gfx.rlc.funcs->get_csb_buffer(adev, dst_ptr);" in

Re: [PATCH 5/5] drm/amdgpu: fix calltrace during kmd unload

2019-11-27 Thread Yuan, Xiaojie
[AMD Official Use Only - Internal Distribution Only] Hi Monk, As long as the content of CSIB won't be changed by CP FW in runtime, I have no objection to 're-initialize after S3 resume'. I am not quite sure about the actual behavior, let me do an experiment to confirm that and add Hawking / Jac

[PATCH] drm/amdgpu: fix calltrace during kmd unload(v2)

2019-11-27 Thread Monk Liu
kernel would report a warning on double unpin on the csb BO because we unpin it during hw_fini but actually we don't need to pin/unpin it during hw_init/fini since it is created with kernel pinned v2: get_csb in init_rlc so hw_init() will make CSIB content back even after reset or s3. take care of

Re: GART write flush error on SI w/ amdgpu

2019-11-27 Thread Dave Airlie
On Wed, 21 Jun 2017 at 00:03, Marek Olšák wrote: > > On Tue, Jun 20, 2017 at 1:46 PM, Christian König > wrote: > > Am 20.06.2017 um 12:34 schrieb Marek Olšák: > >> > >> BTW, I noticed the flush sequence in the kernel is wrong. The correct > >> flush sequence should be: > >> > >> 1) EVENT_WRITE_EO

RE: [PATCH 01/10] drm/amdgpu: remove ras global recovery handling from ras_controller_int handler

2019-11-27 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only] With the v2 version for patch #6, #7 and the fix to enable doorbell int after BACO exit in Patch #5, The series is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Le Ma Sent: 2019年11月27日 17:15 To: amd-gfx@lis

RE: [PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper

2019-11-27 Thread Zhou1, Tao
> -Original Message- > From: Le Ma > Sent: 2019年11月27日 17:15 > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Chen, Guchun > ; Zhou1, Tao ; Li, Dennis > ; Deucher, Alexander > ; Ma, Le > Subject: [PATCH 05/10] drm/amdgpu: enable/disable doorbell interrupt in > baco entry/exit