RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Liu, Monk
The bug I hit occurs because the vmid counter logic is changed in my local branch (I optimized away the unnecessary vm-flush after GPU recovery). Christian is right on that, so with the drm-next branch my bug won't hit. I'm fine with sched_dep there as long as there is no such issue. Thanks /Monk -Original Messa

RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Liu, Monk
> So you have a pipeline sync when you don't need one and that is really really > bad for things shared between processes, e.g. X/Wayland and its clients. Oh, that may explain the thing here: My environment is a no-X-window system (customer's cloud gaming use case), so I don't launch X at all,

Re: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests

2018-11-05 Thread Daniel Vetter
On Fri, Nov 02, 2018 at 04:26:49PM +0800, Chunming Zhou wrote: > Signed-off-by: Chunming Zhou > --- > tests/amdgpu/Makefile.am | 3 +- > tests/amdgpu/amdgpu_test.c | 12 ++ > tests/amdgpu/amdgpu_test.h | 21 +++ > tests/amdgpu/meson.build | 2 +- > tests/amdgpu/syncobj_tests.c |

RE: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests

2018-11-05 Thread Zhou, David(ChunMing)
> -Original Message- > From: Daniel Vetter On Behalf Of Daniel Vetter > Sent: Monday, November 05, 2018 5:39 PM > To: Zhou, David(ChunMing) > Cc: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH libdrm 5/5] [libdrm] add syncobj timeline tests > > On

Re: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Koenig, Christian
> BTW: could we let the Job remember the hw fence seq that it need to sync up > to ? e.g. in "drm_sched_entity_clear_dep" we not only wake up scheduler but > also set the hw fence seq number to the job (and keep the big one), so in the > end in amdgpu_ib_schedule(), we knows exactly the last seq

[PATCH 0/3] RLC kernel code update to improve resuabillity

2018-11-05 Thread likun Gao
From: Likun Gao Hi, This patch series modifies the RLC code to improve its reusability. The process is separated into three parts: Part1[PATCH 1/3]: Unify RLC's functions into the struct amdgpu_rlc_funcs and change the method of calling RLC. Part2[PATCH 2/3]: Abstract RLC's

[PATCH 1/3] drm/amdgpu: unify rlc function into structure

2018-11-05 Thread likun Gao
From: Likun Gao Put the functions rlc_init, rlc_fini, rlc_resume, rlc_stop and rlc_start into the structure amdgpu_rlc_funcs and change the method of calling the rlc functions for each version of GFX. Signed-off-by: Likun Gao --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 6 ++ drivers/gpu/drm/amd/amdgpu/gfx_v6_
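
For readers unfamiliar with the pattern this patch describes, here is a minimal sketch of a per-generation callback table. Only the callback names (init, fini, resume, stop, start) come from the patch description. The `demo_*` types, the function signatures and the `rlc_running` flag are assumptions made for illustration and are not the real amdgpu definitions.

```c
/* Hypothetical stand-in for the device structure (not struct amdgpu_device). */
struct demo_device;

/* Callback table; member names follow the patch description, signatures are assumed. */
struct demo_rlc_funcs {
	int  (*init)(struct demo_device *dev);
	void (*fini)(struct demo_device *dev);
	int  (*resume)(struct demo_device *dev);
	void (*stop)(struct demo_device *dev);
	void (*start)(struct demo_device *dev);
};

struct demo_device {
	const struct demo_rlc_funcs *rlc_funcs; /* selected once per GFX generation */
	int rlc_running;                        /* illustrative state only */
};

/* One hypothetical GFX generation's implementation of the callbacks. */
static int  demo_gfx_a_rlc_init(struct demo_device *dev)  { dev->rlc_running = 0; return 0; }
static void demo_gfx_a_rlc_fini(struct demo_device *dev)  { dev->rlc_running = 0; }
static void demo_gfx_a_rlc_stop(struct demo_device *dev)  { dev->rlc_running = 0; }
static void demo_gfx_a_rlc_start(struct demo_device *dev) { dev->rlc_running = 1; }

static int demo_gfx_a_rlc_resume(struct demo_device *dev)
{
	/* Resume is modelled here as a plain stop followed by a start. */
	demo_gfx_a_rlc_stop(dev);
	demo_gfx_a_rlc_start(dev);
	return 0;
}

static const struct demo_rlc_funcs demo_gfx_a_rlc_funcs = {
	.init   = demo_gfx_a_rlc_init,
	.fini   = demo_gfx_a_rlc_fini,
	.resume = demo_gfx_a_rlc_resume,
	.stop   = demo_gfx_a_rlc_stop,
	.start  = demo_gfx_a_rlc_start,
};

/* Generation-independent caller: it only goes through the table. */
static int demo_rlc_restart(struct demo_device *dev)
{
	dev->rlc_funcs->stop(dev);
	return dev->rlc_funcs->resume(dev);
}
```

The point of such a table is that common code like `demo_rlc_restart` never names a specific GFX version, so each generation only has to supply its own table.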

[PATCH 3/3] drm/amdgpu: separate amdgpu_rlc into a single file.

2018-11-05 Thread likun Gao
From: Likun Gao separate the function and struct of RLC from the file of GFX Signed-off-by: Likun Gao --- drivers/gpu/drm/amd/amdgpu/Makefile | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 202 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 70 +- drive

[PATCH 2/3] drm/amdgpu: abstract the function of enter/exit safe mode for RLC.

2018-11-05 Thread likun Gao
From: Likun Gao abstract the function of amdgpu_gfx_rlc_enter/exit_safe_mode, amdgpu_gfx_rlc_fini and some part of rlc_init to improve the reusability of RLC. Signed-off-by: Likun Gao --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 201 drivers/gpu/drm/amd/amdgpu/

Re: [PATCH 0/3] RLC kernel code update to improve resuabillity

2018-11-05 Thread Christian König
Hi Likun, I'm not deep enough into what the RLC does and how it does its job to completely judge, but in general that looks like a really nice cleanup. Just two notes: 1. Don't add functions first to amdgpu_gfx.[ch] and in the next patch move them over to amdgpu_rlc.[ch]. Just add them directly to the

RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Liu, Monk
Hi Christian For scenario: Bad Job (hang, vmid1) -->Job A (context 10, explicit dep for Job B, vmid2) --> Job B(context 10, vmid2) --> Job C (context 11, vmid3) Assume "job_hang_limit" is 0, and assume "sched_hw_submission" is 4, I give a second thought on the logic after GPU reset: 1) the ba

RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Liu, Monk
> That won't work correctly because the sequence wraps around from time to time > and you can't correctly handle that in PM4 AFAIK. Well maybe with two waits, > but not 100% sure. Hmmm, I still want to give a try, and I still feel current logic is wrong after TDR , see my reply in another email
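
The wrap-around concern quoted above can be illustrated on the CPU side with a generic wrap-tolerant comparison. This is a sketch of the common signed-difference idiom only. It assumes 32-bit sequence numbers that are never more than 2^31 apart. The helper names are hypothetical, and it says nothing about what a PM4 wait packet can or cannot express.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if sequence number a was issued after b, tolerating 32-bit wrap-around.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the result as
 * signed (two's complement, as on all mainstream compilers) gives the
 * direction, provided the two values are less than 2^31 apart.
 */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* "Keep the big one": pick the later of two sequence numbers. */
static uint32_t seq_later(uint32_t a, uint32_t b)
{
	return seq_after(a, b) ? a : b;
}
```

A plain `a > b` would get the case just after a wrap wrong: 2 is numerically smaller than 0xFFFFFFFE even though it was issued later.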

Re: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Koenig, Christian
> and later its VMID's "current_gpu_reset_count" is updated to > "adev->gpu_reset_count" The question is how much later that is done. My recollection is that we don't reset that for resubmission, but that could be wrong. Anyway I think the cleanest approach to always handle that correctly would

RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Liu, Monk
> Anyway I think the cleanest approach to always handle that correctly would be > to always insert a vm flush before all jobs on resubmission. That is most likely better for VM flush handling as well. Yeah, that’s true and more simple /Monk -Original Message- From: Koenig, Christian S

RE: [PATCH 2/3] drm/amdgpu: drop the sched_sync

2018-11-05 Thread Liu, Monk
> The question is how much later that is done. My recollection is that we don't > reset that for resubmission, but that could be wrong. According to my code (drm-next) VMID's "current_gpu_rest_counter" is updated in "vm_flush" routine, so it is always there for either resubmission or not ... Se

Re: [PATCH] drm/amd/display: Fix misleading buffer information

2018-11-05 Thread Li, Sun peng (Leo)
+amdgfx, amdgpu specific patches should go here On 2018-11-05 05:33 AM, Shaokun Zhang wrote: > RETIMER_REDRIVER_INFO shows the buffer as a decimal value with a '0x' > prefix, which is somewhat misleading. > > Fix it to print hexadecimal, as was intended. > > Fixes: 2f14bc89("drm/amd/display: add
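
The mismatch described in the quoted commit message is easy to reproduce in isolation. The sketch below is generic C, not the actual display code. The function names are hypothetical and the format strings are assumed stand-ins for whatever the real log macro uses.

```c
#include <stdio.h>

/* Misleading: a decimal conversion behind a "0x" prefix. */
static int format_misleading(char *out, size_t n, unsigned int value)
{
	return snprintf(out, n, "0x%u", value);
}

/* Intended: the "0x" prefix is followed by a hexadecimal conversion. */
static int format_hex(char *out, size_t n, unsigned int value)
{
	return snprintf(out, n, "0x%x", value);
}
```

For the value 16, the first helper produces `0x16`, which a reader would decode as 22, while the second produces `0x10`.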

Re: [PATCH] drm/amd/amdgpu/dm: Fix dm_dp_create_fake_mst_encoder()

2018-11-05 Thread Wentland, Harry
On 2018-11-01 9:51 p.m., Lyude Paul wrote: > [why] > Removing connector reusage from DM to match the rest of the tree ended > up revealing an issue that was surprisingly subtle. The original amdgpu > code for DC that was submitted appears to have left a chunk in > dm_dp_create_fake_mst_encoder() th

Re: [PATCH] drm/radeon: ratelimit bo warnings

2018-11-05 Thread Nick Alcock
On 9 Oct 2018, Michel Dänzer stated: > On 2018-10-05 7:14 p.m., Nick Alcock wrote: >> On 5 Oct 2018, Michel Dänzer told this: >> >>> On 2018-10-04 9:58 p.m., Nick Alcock wrote: So a few days ago I started getting sprays of these warnings -- sorry, but because it was a few days ago I'm n

[PATCH 1/2] drm/amdgpu/display: check if fbc is available in set_static_screen_control

2018-11-05 Thread Alex Deucher
The value is dependent on whether fbc is available. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/am

[PATCH 2/2] drm/amdgpu/display: disable FBC

2018-11-05 Thread Alex Deucher
Causes a black screen on Stoney laptop. bug: https://bugs.freedesktop.org/show_bug.cgi?id=108577 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce

[PATCH 2/2] drm/amd/display: Stop leaking planes

2018-11-05 Thread Harry Wentland
[Why] drm_plane_cleanup does not free the plane. [How] Call drm_primary_helper_destroy which will also free the plane. Signed-off-by: Harry Wentland --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/disp

[PATCH 1/2] drm/amdgpu: Drop amdgpu_plane

2018-11-05 Thread Harry Wentland
It's unnecessarily duplicating drm_plane_type. Signed-off-by: Harry Wentland --- drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h | 8 +--- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 38 +-- 2 files changed, 20 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/amd/a

RE: [PATCH 2/2] drm/amd/display: Stop leaking planes

2018-11-05 Thread Deucher, Alexander
> -Original Message- > From: amd-gfx On Behalf Of > Harry Wentland > Sent: Monday, November 5, 2018 3:45 PM > To: amd-gfx@lists.freedesktop.org > Cc: Wentland, Harry > Subject: [PATCH 2/2] drm/amd/display: Stop leaking planes > > [Why] > drm_plane_cleanup does not free the plane. > > [H

RE: [PATCH 1/2] drm/amdgpu: Drop amdgpu_plane

2018-11-05 Thread Deucher, Alexander
> -Original Message- > From: amd-gfx On Behalf Of > Harry Wentland > Sent: Monday, November 5, 2018 3:45 PM > To: amd-gfx@lists.freedesktop.org > Cc: Wentland, Harry > Subject: [PATCH 1/2] drm/amdgpu: Drop amdgpu_plane > > It's unnecessarily duplicating drm_plane_type. > > Signed-off-by

Re: [PATCH] drm/amdkfd: fix interrupt spin lock

2018-11-05 Thread Kuehling, Felix
On 2018-11-04 2:20 p.m., Christian König wrote: > Am 02.11.18 um 19:59 schrieb Kuehling, Felix: >> On 2018-11-02 9:48 a.m., Christian König wrote: >>> Vega10 has multiple interrupt rings, >> I don't think I've seen your code that implements multiple interrupt >> rings. So it's a bit hard to comment

Re: [PATCH 1/2] drm/amdgpu/display: check if fbc is available in set_static_screen_control

2018-11-05 Thread Wentland, Harry
On 2018-11-05 2:27 p.m., Alex Deucher wrote: > The value is dependent on whether fbc is available. > > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/

Re: [PATCH 2/2] drm/amdgpu/display: disable FBC

2018-11-05 Thread Wentland, Harry
On 2018-11-05 2:27 p.m., Alex Deucher wrote: > Causes a black screen on Stoney laptop. > > bug: https://bugs.freedesktop.org/show_bug.cgi?id=108577 > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion

Re: [PATCH v3 3/3] drm/amdgpu: Change powerplay clock requests to MHz

2018-11-05 Thread Wentland, Harry
On 2018-11-02 9:26 a.m., David Francis wrote: > This will clean up powerplay code, as we are no longer > multiplying the clocks by 1000 in DM and then dividing them > by 1000 in powerplay > > Signed-off-by: David Francis Series is Reviewed-by: Harry Wentland Harry > --- > drivers/gpu/drm/amd

[PATCH 0/9] KFD upstreaming Nov 2018, part 1

2018-11-05 Thread Kuehling, Felix
These are some recent patches that are easy to upstream (part 1). For part 2 (hopefully still this month) I'll need to advance the merging of KFD into amdgpu a little further to avoid upstreaming duplicated data structures that no longer need to be duplicated. Eric Huang (1): drm/amdkfd: change

[PATCH 1/9] drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_manager

2018-11-05 Thread Kuehling, Felix
From: Yong Zhao This will make reading code much easier. This fixes a few spots missed in a previous commit with the same title. Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 ++-- 1 f

[PATCH 7/9] drm/amdkfd: Fix and simplify sync object handling for KFD

2018-11-05 Thread Kuehling, Felix
The adev parameter in amdgpu_sync_fence and amdgpu_sync_resv is only needed for updating sync->last_vm_update. This breaks if different adevs are passed to calls for the same sync object. Always pass NULL for calls from KFD because sync objects used for KFD don't belong to any particular device, a

[PATCH 5/9] drm/amdgpu: Remove explicit wait after VM validate

2018-11-05 Thread Kuehling, Felix
From: Harish Kasiviswanathan PD or PT might have to be moved during validation and this move has to be completed before updating it. If page table updates are done using SDMA then this serializing is done by SDMA command submission. And if PD/PT updates are done by CPU, then explicit waiting for

[PATCH 6/9] drm/amdgpu: KFD Restore process: Optimize waiting

2018-11-05 Thread Kuehling, Felix
From: Harish Kasiviswanathan Instead of waiting for each KFD BO after validation just wait for the last BO moving fence. Signed-off-by: Harish Kasiviswanathan Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 13 - 1

[PATCH 2/9] drm/amdkfd: Added Vega12 and Polaris12 for KFD.

2018-11-05 Thread Kuehling, Felix
From: Gang Ba Add Vega12 and Polaris12 device info and device IDs to KFD. Signed-off-by: Gang Ba Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 3 +- drivers/gpu/dr

[PATCH 3/9] drm/amdkfd: Adjust the debug message in KFD ISR

2018-11-05 Thread Kuehling, Felix
From: Yong Zhao This makes the debug message get printed even when there is an early return. Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) d

[PATCH 8/9] drm/amdgpu: Fix KFD doorbell SG BO mapping

2018-11-05 Thread Kuehling, Felix
This change prepares for adding SG BOs that will be used for mapping doorbells into GPUVM address space. This type of BO would be mistaken for an invalid userptr BO. Improve that check to test that it's actually a userptr BO so that SG BOs that are still in the CPU domain can be validated and mapp

[PATCH 4/9] drm/amdkfd: Workaround PASID missing in gfx9 interrupt payload under non HWS

2018-11-05 Thread Kuehling, Felix
From: Yong Zhao This is a known gfx9 HW issue, and this change can perfectly work around the issue. Signed-off-by: Yong Zhao Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 25 ++--- 1 file changed, 22 inserti

[PATCH 9/9] drm/amdkfd: change system memory overcommit limit

2018-11-05 Thread Kuehling, Felix
From: Eric Huang This improves the system limit by: 1. replacing userptrlimit with a total memory limit that counts TTM memory usage and userptr usage. 2. counting acc size for all BOs. Signed-off-by: Eric Huang Reviewed-by: Felix Kuehling Signed-off-by: Felix Kuehling --- drivers/gpu/drm/am

[PATCH 11/11] drm/amdgpu/psp: add set_topology_info function

2018-11-05 Thread Alex Deucher
From: Hawking Zhang set_topology_info is used for driver to set current topology info to xgmi ta Signed-off-by: Hawking Zhang Reviewed-by: Shaoyun Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 24 +++- 1 file changed, 23 insertions(+), 1 del

[PATCH 03/11] drm/amdgpu/psp: add helper function to load/unload xgmi ta

2018-11-05 Thread Alex Deucher
From: Hawking Zhang Add helper functions for the psp xgmi ta. Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Reviewed-by: Huang Rui Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 98 + 1 file changed, 98 insertions(+) diff --g

[PATCH 00/11] Add psp interface for xgmi

2018-11-05 Thread Alex Deucher
This adds the psp interface for xgmi discovery and configuration. Hawking Zhang (11): drm/amdgpu/psp: add structure for xgmi ta and its shared buffer drm/amdgpu/psp: init/de-init xgmi ta microcode drm/amdgpu/psp: add helper function to load/unload xgmi ta drm/amdgpu/psp: add xgmi ta header

[PATCH 10/11] drm/amdgpu/psp: add get_topology_info function

2018-11-05 Thread Alex Deucher
From: Hawking Zhang get_topology_info function is used for driver to query topology_info for current device from xgmi ta Signed-off-by: Hawking Zhang Reviewed-by: Shaoyun Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 39 ++ 1 fil

[PATCH 08/11] drm/amdgpu/psp: add get_hive_id function

2018-11-05 Thread Alex Deucher
From: Hawking Zhang get_hive_id is used for driver to query hive_id for current device from xgmi ta Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Reviewed-by: Huang Rui Reviewed-by: Shaoyun Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 17 --

[PATCH 05/11] drm/amdgpu/psp: add helper function to invoke xgmi ta per ta cmd_id

2018-11-05 Thread Alex Deucher
From: Hawking Zhang psp_xgmi_invoke is the helper function to issue ta cmd to firmware Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Reviewed-by: Huang Rui Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 36 + drivers/gpu/drm/a

[PATCH 01/11] drm/amdgpu/psp: add structure for xgmi ta and its shared buffer

2018-11-05 Thread Alex Deucher
From: Hawking Zhang Add data structures for xgmi trusted application. Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/gpu

[PATCH 02/11] drm/amdgpu/psp: init/de-init xgmi ta microcode

2018-11-05 Thread Alex Deucher
From: Hawking Zhang Add ucode handling for psp xgmi ta firmware. Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 12 ++ drivers/gpu/drm/amd/amdgpu/psp_v1

[PATCH 09/11] drm/amdgpu/psp: update topology info structures

2018-11-05 Thread Alex Deucher
From: Hawking Zhang topology info structure needs to match with the one defined in xgmi ta Signed-off-by: Hawking Zhang Reviewed-by: Shaoyun Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 29 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi

[PATCH 07/11] drm/amdgpu/psp: add get_node_id function

2018-11-05 Thread Alex Deucher
From: Hawking Zhang get_node_id function is used for driver to get node_id for current device from xgmi ta Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Reviewed-by: Huang Rui Reviewed-by: Shaoyun Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 +- d

[PATCH 04/11] drm/amdgpu/psp: add xgmi ta header

2018-11-05 Thread Alex Deucher
From: Hawking Zhang Add the psp xgmi driver interface. Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Reviewed-by: Huang Rui Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/ta_xgmi_if.h | 130 1 file changed, 130 insertions(+) create mode 10

[PATCH 06/11] drm/amdgpu/psp: initialize xgmi session (v2)

2018-11-05 Thread Alex Deucher
From: Hawking Zhang Setup and tear down xgmi as part of psp. v2: - make psp_xgmi_terminate static - squash in: drm/amdgpu: only issue xgmi cmd when it is enabled drm/amdgpu/psp: terminate xgmi ta in suspend and hw_fini phase Signed-off-by: Hawking Zhang Acked-by: Alex Deucher Reviewed-by: Hua