Re: [PATCH 4/8] drm/amdgpu: convert kiq ring_mutex to a spinlock

2017-05-08 Thread Andres Rodriguez
On 2017-05-08 11:51 PM, Liu, Monk wrote: Can you explain your reasoning behind your current position that the KIQ shouldn't be used by baremetal amdgpu? [ML] I didn't mean KIQ shouldn't be leveraged by bare-metal; rather, how it is used by bare-metal is not my concern ... I mean it better

RE: [PATCH 4/8] drm/amdgpu: convert kiq ring_mutex to a spinlock

2017-05-08 Thread Liu, Monk
>Can you explain your reasoning behind your current position that the KIQ >shouldn't be used by baremetal amdgpu? [ML] I didn't mean KIQ shouldn't be leveraged by bare-metal; rather, how it is used by bare-metal is not my concern ... I mean it better not be used under SR-IOV case by other cli

Re: [PATCH 2/2] drm/amdgpu/soc15: use atomfirmware for setting bios scratch for reset

2017-05-08 Thread zhoucm1
On 2017-05-05 22:27, Alex Deucher wrote: Need to use the atomfirmware interface rather than atombios since soc15 is atomfirmware based. Signed-off-by: Alex Deucher The series is Reviewed-by: Chunming Zhou --- drivers/gpu/drm/amd/amdgpu/soc15.c | 6 +++--- 1 file changed, 3 insertions(+

Re: [RFC] Problems with SRBM select on KIQ

2017-05-08 Thread zhoucm1
On 2017-05-06 06:57, Felix Kuehling wrote: We ran into a similar problem when we played with priorities on KFD queues. You can't change an MQD of a currently mapped queue. To change a queue priority we need to unmap it, update the MQD, and then map it again. I wonder if there is similar requi
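The unmap, update, remap ordering described in the preview above can be sketched as follows. This is only a minimal, self-contained illustration and not amdgpu or KFD code: `struct toy_mqd`, `struct toy_queue` and all `toy_*` functions are hypothetical names, and the "MQD" here is just a struct holding a priority that the pretend hardware latches at map time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory queue descriptor (MQD). */
struct toy_mqd {
	int priority;
};

/* Hypothetical queue: the "hardware" only reads the MQD when mapping. */
struct toy_queue {
	struct toy_mqd mqd;
	bool mapped;
	int hw_priority;	/* value latched from the MQD at map time */
};

static void toy_queue_map(struct toy_queue *q)
{
	q->hw_priority = q->mqd.priority;
	q->mapped = true;
}

static void toy_queue_unmap(struct toy_queue *q)
{
	q->mapped = false;
}

/* Assumption for this sketch: MQD writes are refused while mapped. */
static int toy_mqd_write_priority(struct toy_queue *q, int prio)
{
	if (q->mapped)
		return -EBUSY;
	q->mqd.priority = prio;
	return 0;
}

/* Unmap, update the MQD, then map again so the new value takes effect. */
static int toy_queue_set_priority(struct toy_queue *q, int prio)
{
	bool was_mapped = q->mapped;
	int ret;

	if (was_mapped)
		toy_queue_unmap(q);
	ret = toy_mqd_write_priority(q, prio);
	if (was_mapped)
		toy_queue_map(q);
	return ret;
}
```

In this toy model a direct `toy_mqd_write_priority()` on a mapped queue fails with `-EBUSY`, while `toy_queue_set_priority()` succeeds and leaves the queue mapped with the new priority latched.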

RE: [PATCH 4/4] drm/amdgpu/SRIOV:implement guilty job TDR (V2)

2017-05-08 Thread Liu, Monk
> > - /* block scheduler */ > - for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { > - ring = adev->rings[i]; > + /* we start from the ring trigger GPU hang */ > + j = job ? job->ring->idx : 0; > + > + if (job) > + if (amd_sched_invalidate_job(&job->base, amdgpu

Re: [PATCH] drm/amdgpu: fix errors in comments.

2017-05-08 Thread William Lewis
On 05/08/2017 10:32 AM, Alex Xie wrote: > Signed-off-by: Alex Xie > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > ind

Re: [PATCH 1/5] drm: introduce sync objects

2017-05-08 Thread Dave Airlie
On 4 May 2017 at 18:16, Chris Wilson wrote: > On Wed, Apr 26, 2017 at 01:28:29PM +1000, Dave Airlie wrote: >> +#include > > I wonder if Daniel has already split everything used here into its own > headers? not sure, if drm_file is out there yet. I'll find out when I rebase this onto something ne

Re: [PATCH] drm/amdgpu: Fix comments in source code

2017-05-08 Thread Michel Dänzer
On 09/05/17 10:26 AM, Alex Xie wrote: > Signed-off-by: Alex Xie > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 2

Re: [PATCH] drm/amdgpu: fix errors in comments.

2017-05-08 Thread Xie, AlexBin
Hi Michel, The code change has been submitted into our internal git server. I have a follow up commit in another email thread. The commit fixes more errors in comments. Thanks, Alex Bin From: Michel Dänzer Sent: Monday, May 8, 2017 9:13 PM To: Xie, AlexBin

[PATCH] drm/amdgpu: Fix comments in source code

2017-05-08 Thread Alex Xie
Signed-off-by: Alex Xie --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 2704f88..480f3cd 100644 --- a/drivers/gpu/drm/amd/amdgp

Re: [PATCH] drm/amdgpu: fix errors in comments.

2017-05-08 Thread Michel Dänzer
On 09/05/17 12:32 AM, Alex Xie wrote: > Signed-off-by: Alex Xie > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 66

Re: [PATCH 1/2] drm/amdgpu/atomfirmware: add function to update engine hang status

2017-05-08 Thread Harry Wentland
On 2017-05-08 02:32 PM, Alex Deucher wrote: On Fri, May 5, 2017 at 10:27 AM, Alex Deucher wrote: Update the scratch reg for when the engine is hung. Signed-off-by: Alex Deucher ping on this series. I'm not an expert on this and haven't had a chance to look up the atomfirmware definition

Re: Soliciting DRM feedback on latest DC rework

2017-05-08 Thread Harry Wentland
On 2017-05-08 03:07 PM, Dave Airlie wrote: On 9 May 2017 at 04:54, Harry Wentland wrote: Hi Daniel, Thanks for taking the time to look at DC. I had a couple more questions/comments in regard to the patch you posted on IRC: http://paste.debian.net/plain/930704 My impression is that this ite

Re: Soliciting DRM feedback on latest DC rework

2017-05-08 Thread Dave Airlie
On 9 May 2017 at 04:54, Harry Wentland wrote: > Hi Daniel, > > Thanks for taking the time to look at DC. > > I had a couple more questions/comments in regard to the patch you posted on > IRC: http://paste.debian.net/plain/930704 > > My impression is that this item is the most important next step f

Re: Soliciting DRM feedback on latest DC rework

2017-05-08 Thread Harry Wentland
Hi Daniel, Thanks for taking the time to look at DC. I had a couple more questions/comments in regard to the patch you posted on IRC: http://paste.debian.net/plain/930704 My impression is that this item is the most important next step for us: From a quick glance I think what we want ins

Re: [PATCH] drm/amdgpu: remove unsed amdgpu_gem_handle_lockup

2017-05-08 Thread Alex Deucher
On Mon, May 8, 2017 at 9:25 AM, Christian König wrote: > From: Christian König > > This kind of reset handling was removed a long time ago. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 45 > - >

Re: [PATCH 1/2] drm/amdgpu/atomfirmware: add function to update engine hang status

2017-05-08 Thread Alex Deucher
On Fri, May 5, 2017 at 10:27 AM, Alex Deucher wrote: > Update the scratch reg for when the engine is hung. > > Signed-off-by: Alex Deucher ping on this series. Alex > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 13 + > drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h

Re: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-05-08 Thread Daniel Drake
On Wed, Apr 5, 2017 at 9:01 AM, Nath, Arindam wrote: > > >-Original Message- > >From: Daniel Drake [mailto:dr...@endlessm.com] > >Sent: Thursday, March 30, 2017 7:15 PM > >To: Nath, Arindam > >Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd- > >g...@lists.freedesktop.org; io..

[PATCH] gpu: drm: amd: amdgpu: remove dead code

2017-05-08 Thread Gustavo A. R. Silva
Local variable use_doorbell is assigned to a constant value and it is never updated again. Remove this variable and the dead code it guards. Addresses-Coverity-ID: 1401828 Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 53 +-- 1 fil

RE: [PATCH] drm/amdgpu: fix errors in comments.

2017-05-08 Thread Deucher, Alexander
> -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Alex Xie > Sent: Monday, May 08, 2017 11:32 AM > To: amd-gfx@lists.freedesktop.org > Cc: Xie, AlexBin > Subject: [PATCH] drm/amdgpu: fix errors in comments. > > Signed-off-by: Alex Xie Re

[PATCH] gpu: drm: amd: amdgpu: remove dead code

2017-05-08 Thread Gustavo A. R. Silva
Local variable use_doorbell is assigned to a constant value and it is never updated again. Remove this variable and the dead code it guards. Addresses-Coverity-ID: 1401837 Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 20 ++-- 1 file changed, 6 in

Re: [PATCH] drm/amdgpu/gfx6: flush caches after IB with the correct vmid

2017-05-08 Thread Nicolai Hähnle
Unfortunately, further testing shows that this doesn't actually fix the problem. FWIW, that test runs very reliably on SI with the radeon drm, but with the amdgpu drm it fails. VI is fine on amdgpu, which is why I was sent down this road. Anyway, back to trying to figure this out :/ Cheers, N

RE: [PATCH 4/8] drm/amdgpu: convert kiq ring_mutex to a spinlock

2017-05-08 Thread Andres Rodriguez
On 2017-05-08 02:08 AM, Liu, Monk wrote: > Andres > > Some previous patches like move KIQ mutex-lock from amdgpu_virt to common > place jumped my NAK, but from technique perspective it's no matter anyway, > But this patch and the following patches are go to a dead end, > > 1, Don't use KIQ

[PATCH] drm/amdgpu: fix errors in comments.

2017-05-08 Thread Alex Xie
Signed-off-by: Alex Xie --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 66bb60e..aab3206 100644 --- a/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: remove unsed amdgpu_gem_handle_lockup

2017-05-08 Thread Christian König
From: Christian König This kind of reset handling was removed a long time ago. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 45 - 1 file changed, 11 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ge

Re: [PATCH 4/4] drm/amdgpu/SRIOV:implement guilty job TDR (V2)

2017-05-08 Thread Christian König
On 08.05.2017 at 09:01, Liu, Monk wrote: @Christian This one is changed to guilty job scheme accordingly with your response BR Monk -Original Message- From: Monk Liu [mailto:monk@amd.com] Sent: Monday, May 08, 2017 3:00 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject:

Re: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Christian König
Because we can always rely on TDR and HYPERVISOR to detect GPU hang and resubmit malicious jobs or even kick them out later, and the gpu reset will eventually be invoked, so there is no reason to manually and voluntarily call gpu reset under SRIOV case. Well there is a rather good reason, we det

RE: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Liu, Monk
The VM fault interrupt or illegal instruction will be delivered to the GPU whether it's the SR-IOV or the bare-metal case. I removed them from invoking GPU reset for the same reason: don't trigger GPU reset in the SR-IOV case if possible. Always be aware that triggering GPU reset under SR-IOV is a heavy

Re: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Christian König
Sounds good, but what do we do with the amdgpu_irq_reset_work_func? Please note that I find that calling amdgpu_gpu_reset() here is a bad idea in the first place. Instead we should consider the scheduler as faulting and let the scheduler handle that in the same way as a job timeout. But

Re: [PATCH] drm/amdgpu:no debugfs_gpu_reset for SRIOV

2017-05-08 Thread Christian König
On 08.05.2017 at 11:28, Monk Liu wrote: Change-Id: Ie9730852da54ceb8b4c2c44acac2df3556a32d17 Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers

[PATCH] drm/amdgpu/gfx6: flush caches after IB with the correct vmid

2017-05-08 Thread Nicolai Hähnle
From: Nicolai Hähnle Bring the code in line with what the radeon module does. Without this change, the fence following the IB may be signalled to the CPU even though some data written by shaders may not have been written back yet. This change fixes the OpenGL CTS test GL45-CTS.gtf32.GL3Tests.pa

[PATCH] drm/amdgpu:no debugfs_gpu_reset for SRIOV

2017-05-08 Thread Monk Liu
Change-Id: Ie9730852da54ceb8b4c2c44acac2df3556a32d17 Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index fe17

RE: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Liu, Monk
I agree with disabling debugfs for amdgpu_reset when SRIOV detected. -Original Message- From: Christian König [mailto:deathsim...@vodafone.de] Sent: Monday, May 08, 2017 5:20 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset

Re: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Christian König
You know that gpu reset under SR-IOV will have very big impact on all other VFs ... Mhm, good argument. But in this case we need to give at least some warning message instead of doing nothing. Or even better disable creating the amdgpu_reset debugfs file altogether. This way nobody will wonde

RE: [PATCH 3/4] drm/amdgpu:only call flr_work under infinite timeout

2017-05-08 Thread Liu, Monk
Yeah, my mistake, thanks for the catch -Original Message- From: Christian König [mailto:deathsim...@vodafone.de] Sent: Monday, May 08, 2017 5:11 PM To: Liu, Monk ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 3/4] drm/amdgpu:only call flr_work under infinite timeout On 08.05.2017 at 08:

Re: [PATCH 3/4] drm/amdgpu:only call flr_work under infinite timeout

2017-05-08 Thread Christian König
On 08.05.2017 at 08:51, Monk Liu wrote: Change-Id: I541aa5109f4fcab06ece4761a09dc7e053ec6837 Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/

RE: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Liu, Monk
For the SR-IOV use case, we call gpu reset in the case where we have no choice ... So many places like debugfs shouldn't be a good reason to trigger gpu reset. You know that gpu reset under SR-IOV will have a very big impact on all other VFs ... BR Monk -Original Message- From: Christian König [ma

Re: [PATCH 2/4] drm/amdgpu:use job* to replace voluntary

2017-05-08 Thread Christian König
On 08.05.2017 at 08:51, Monk Liu wrote: that way we can know which job cause hang and can do per sched reset/recovery instead of all sched. Change-Id: Ifc98cd74b2d93823c489de6a89087ba188957eff Signed-off-by: Monk Liu Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_dev

Re: [PATCH 1/4] drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset

2017-05-08 Thread Christian König
On 08.05.2017 at 08:51, Monk Liu wrote: because we don't want to do sriov-gpu-reset under certain cases, so just split those two functions and don't invoke sr-iov one from bare-metal one. Change-Id: I641126c241e2ee2dfd54e6d16c389b159f99cfe0 Signed-off-by: Monk Liu --- drivers/gpu/drm/amd/amdgp

[PATCH 4/4] drm/amdgpu/SRIOV:implement guilty job TDR (V2)

2017-05-08 Thread Liu, Monk
@Christian This one is changed to guilty job scheme accordingly with your response BR Monk -Original Message- From: Monk Liu [mailto:monk@amd.com] Sent: Monday, May 08, 2017 3:00 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [PATCH] drm/amdgpu/SRIOV:implement guilty

RE: [PATCH 4/4] drm/amdgpu/SRIOV:implement guilty job TDR for

2017-05-08 Thread Liu, Monk
Sorry, drop this one; it doesn't remove the debug code. I'll send another one after cleanups. -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Monk Liu Sent: Monday, May 08, 2017 2:51 PM To: amd-gfx@lists.freedesktop.org Cc: Liu, Monk Subject: [

[PATCH] drm/amdgpu/SRIOV:implement guilty job TDR (V2)

2017-05-08 Thread Monk Liu
1. TDR will kick out a guilty job if its hangs exceed the threshold given by the kernel parameter "job_hang_limit"; that way a bad command stream will not cause GPU hangs indefinitely. By default this threshold is 1, so a job will be kicked out after it hangs. 2. If a job times out, the TDR routine will not
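The threshold idea in point 1 can be sketched roughly as below. This is a hypothetical, self-contained illustration and not the code from the patch: `struct toy_job` and `toy_job_note_hang` are invented names, and the strict `>` comparison is an assumption that mirrors the word "exceed"; the truncated preview does not show the actual check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical job record; hang_count counts timeouts blamed on this job. */
struct toy_job {
	int hang_count;
	bool kicked_out;
};

/*
 * Record one more hang for the job. Returns true when the job should be
 * dropped instead of resubmitted. Assumption: the job is dropped once its
 * hang count is strictly greater than job_hang_limit.
 */
static bool toy_job_note_hang(struct toy_job *job, int job_hang_limit)
{
	job->hang_count++;
	if (job->hang_count > job_hang_limit)
		job->kicked_out = true;
	return job->kicked_out;
}
```

A recovery path built on this would call `toy_job_note_hang()` for the job that triggered the timeout and skip resubmission whenever it returns true.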