[PATCH] drm/amdgpu: remove pasid_src field from IV entry

2023-04-27 Thread Xiaomeng Hou
PASID_SRC is not actually present in the Interrupt Packet, the field is taken as reserved bits now. So remove it from IV entry to avoid misuse. Signed-off-by: Xiaomeng Hou --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 1 - 2 files changed, 2 deletio

[PATCH v2] drm/dp_mst: Clear MSG_RDY flag before sending new message

2023-04-27 Thread Wayne Lin
[Why] The sequence for collecting down_reply from source perspective should be: Request_n->repeat (get partial reply of Request_n->clear message ready flag to ack DPRX that the message is received) till all partial replies for Request_n are received->new Request_n+1. Now there is chance that drm_
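
The collection sequence described above can be sketched as a small loop. This is only an illustration under stated assumptions: `struct fake_sink` and every function name below are hypothetical stand-ins for a DPRX and are not part of the DRM DP MST helpers. The sketch shows the source reading one partial reply, clearing the message-ready flag to acknowledge it, and repeating until the whole reply for the current request has arrived. Only then may a new request be sent.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DPRX (sink); not a DRM structure. */
struct fake_sink {
    int chunks_left;  /* partial replies still pending for the current request */
    bool msg_rdy;     /* models the "message ready" flag */
    int acks;         /* number of times the source cleared the flag */
};

/* Hypothetical: issue a request that will be answered in `chunks` parts. */
static void send_request(struct fake_sink *s, int chunks)
{
    s->chunks_left = chunks;
    s->msg_rdy = chunks > 0;
}

/* Hypothetical: consume one partial reply; true if it was the last one. */
static bool read_partial_reply(struct fake_sink *s)
{
    s->chunks_left--;
    return s->chunks_left == 0;
}

/* Hypothetical: clear the flag to ack the chunk that was just read.
 * The simulated sink raises the flag again while chunks remain. */
static void clear_msg_rdy(struct fake_sink *s)
{
    s->msg_rdy = false;
    s->acks++;
    if (s->chunks_left > 0)
        s->msg_rdy = true;
}

/* Collect the complete reply for the current request and return how many
 * partial replies were read. A new request is sent only after this returns. */
static int collect_down_reply(struct fake_sink *s)
{
    int parts = 0;
    bool done = false;

    while (!done && s->msg_rdy) {
        done = read_partial_reply(s);
        clear_msg_rdy(s);  /* ack before looking for the next chunk */
        parts++;
    }
    return parts;
}
```

In this model, calling `send_request()` for Request_n+1 is only valid after `collect_down_reply()` has returned for Request_n, which mirrors the ordering the patch description asks for.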

Re: [PATCH V3 1/2] drm/radeon: Fix integer overflow in radeon_cs_parser_init

2023-04-27 Thread whitehat002 whitehat002
Hello, What is the current status of this patch, has it been applied? On Wed, Apr 19, 2023 at 21:49, Alex Deucher wrote: > Applied. Thanks! > > Alex > > On Wed, Apr 19, 2023 at 8:24 AM Christian König > wrote: > > > > On 19.04.23 at 14:20, hackyzh002 wrote: > > > The type of size is unsigned, if size is 0x

[PATCH 2/3] drm/amdgpu: don't output mes error message when gfx hang during gpu reset

2023-04-27 Thread YiPeng Chai
This patch is to clear the invalid mes error message when gfx ras poison consumption causes gpu reset on gfx v11_0_3. [Why]: Gfx ras poison consumption will cause gfx hang, and gfx hang will cause mes to fail to run, and gfx can not be recovered until gpu reset complete. So the mes error mes

[PATCH 1/3] drm/amdgpu: add variable to record gpu reset reason

2023-04-27 Thread YiPeng Chai
Add variable to record gpu reset reason. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 +++ drivers/gpu/drm/amd/amdgpu/gfx_v11_0_3.c | 6 +- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/

[PATCH 3/3] drm/amdgpu: adjust gpu reset sequence for gfx v11_0_3

2023-04-27 Thread YiPeng Chai
When gfx ras poison consumption causes gpu reset on gfx v11_0_3, the sequence of gpu reset is "soft reset -> mode2 reset -> mode1 reset". If the previous reset fails, fall back to the next reset. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 40 -
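
The fallback order described above ("soft reset -> mode2 reset -> mode1 reset") can be sketched as a loop over an ordered table of reset handlers. This is a minimal sketch, not the code from the patch: the enum, the `reset_fn` type and the stub handlers are hypothetical names introduced for illustration, and each handler is assumed to return 0 on success.

```c
#include <stddef.h>

/* Hypothetical ordering of reset methods, weakest first. */
enum reset_method { RESET_SOFT, RESET_MODE2, RESET_MODE1, RESET_METHOD_COUNT };

/* Hypothetical handler type: returns 0 on success, non-zero on failure. */
typedef int (*reset_fn)(void *ctx);

/* Try each reset in order and stop at the first one that succeeds.
 * Returns the method that worked, or -1 if every method failed. */
static int try_resets_in_order(const reset_fn fns[RESET_METHOD_COUNT], void *ctx)
{
    for (int m = RESET_SOFT; m < RESET_METHOD_COUNT; m++) {
        if (fns[m] && fns[m](ctx) == 0)
            return m;
    }
    return -1;
}

/* Stub handlers for illustration; ctx points at an int attempt counter. */
static int reset_fails(void *ctx) { (*(int *)ctx)++; return -1; }
static int reset_ok(void *ctx)    { (*(int *)ctx)++; return 0; }
```

With a table of `{ reset_fails, reset_ok, reset_ok }`, the helper makes two attempts and reports `RESET_MODE2`; the mode1 handler is never called because an earlier step already succeeded.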

Re: [PATCH] drm/amdgpu: drop redudant sched job cleanup when cs is aborted

2023-04-27 Thread Christian König
On 27.04.23 at 03:40, Guchun Chen wrote: Once command submission failed due to userptr invalidation in amdgpu_cs_submit, legacy code will perform cleanup of scheduler job. However, it's not needed at all, as f7d66fb2ea43 has integrated job cleanup stuff into amdgpu_job_free. Otherwise, because o

RE: [PATCH 1/8] drm/scheduler: properly forward fence errors

2023-04-27 Thread Yin, ZhenGuo (Chris)
[AMD Official Use Only - General] Hi, Christian diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index fcd4bfef7415..649fac2e1ccb 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -533,12 +533,12 @@ voi

Re: [PATCH 08/12] drm/amdgpu/gfx8: always restore kcq MQDs

2023-04-27 Thread Christian König
On 26.04.23 at 23:21, Alex Deucher wrote: Always restore the MQD not just when we do a reset. This allows us to move the MQD to VRAM if we want. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff

Re: [PATCH] drm/amdgpu: Recover vram from vmbo->shadow rather than vmbo->bo

2023-04-27 Thread Christian König
On 27.04.23 at 05:23, Lin.Cao wrote: Vmbo->shadow is used to back up the vram bo when vram is lost, so we should set shadow as vmbo->shadow to recover vmbo->bo. Good catch. Fix: 'commit e18aaea733da ("drm/amdgpu: move shadow_list to amdgpu_bo_vm")' Signed-off-by: Lin.Cao --- drivers/gpu/d

Re: [PATCH 1/8] drm/scheduler: properly forward fence errors

2023-04-27 Thread Christian König
Well good point, but as part of the effort of the Intel team to move the scheduler over to a work item based design those two functions are probably about to be removed. Since we will probably have that in the internal package for a bit longer I'm going to send a fix for this. Regards, Chris

Re: [PATCH v3 2/2] drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1

2023-04-27 Thread whitehat002 whitehat002
hello What is the current status of this patch, has it been applied? On Wed, Apr 19, 2023 at 20:23, hackyzh002 wrote: > > The type of size is unsigned int, if size is 0x4000, there will > be an integer overflow, size will be zero after size *= sizeof(uint32_t), > will cause uninitialized memory to be refer
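
The wrap-around the quoted patch describes can be reproduced in a few lines of C. This is a sketch under stated assumptions, not the code from the patch: the function names are hypothetical, and the value 0x40000000 is chosen here because it is the smallest 32-bit word count whose byte size (count times 4) wraps to zero. The checked variant shows one common way to reject such a count before multiplying. The snippet does not show how the actual patch fixes it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unchecked conversion from a count of 32-bit words to bytes.
 * Unsigned arithmetic wraps modulo 2^32, so large counts lose their top bits. */
static uint32_t words_to_bytes_unchecked(uint32_t words)
{
    return words * (uint32_t)sizeof(uint32_t);
}

/* Checked conversion: refuses any count whose byte size does not fit
 * in 32 bits. Returns true and stores the result on success. */
static bool words_to_bytes_checked(uint32_t words, uint32_t *bytes)
{
    if (words > UINT32_MAX / sizeof(uint32_t))
        return false;  /* multiplication would wrap */
    *bytes = words * (uint32_t)sizeof(uint32_t);
    return true;
}
```

With a count of 0x40000000 the unchecked helper returns 0, so a later allocation or copy sized by that result covers no data at all, while the checked helper rejects the count outright.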

Re: [PATCH v3 2/2] drm/amdgpu: Fix integer overflow in amdgpu_cs_pass1

2023-04-27 Thread Alex Deucher
As per my prior reply, it has been applied. Thanks, Alex On Thu, Apr 27, 2023 at 8:39 AM whitehat002 whitehat002 wrote: > > hello > What is the current status of this patch, has it been applied? > > > On Wed, Apr 19, 2023 at 20:23, hackyzh002 wrote: > > > > The type of size is unsigned int, if size is 0x4000

RE: [PATCH] drm/amdgpu: disable SDMA WPTR_POLL_ENABLE for SR-IOV

2023-04-27 Thread Deucher, Alexander
[AMD Official Use Only - General] > -Original Message- > From: Horace Chen > Sent: Thursday, April 27, 2023 2:24 AM > To: amd-gfx@lists.freedesktop.org > Cc: Andrey Grodzovsky ; Quan, Evan > ; Chen, Horace ; Koenig, > Christian ; Deucher, Alexander > ; Xiao, Jack ; Zhang, > Hawking ; Liu,

[PATCH Review 1/1] drm/amdgpu: Add SDMA_UTCL1_WR_FIFO_SED field for sdma_v4_4_ras_field

2023-04-27 Thread Stanley . Yang
Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/sdma_v4_4.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4.c index 6f9895cdddb1..0ddb6955a6d3 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4.c +++

[PATCH] drm/amd/amdgpu: Simplify switch case statements in amdgpu_connectors.c

2023-04-27 Thread Srinivasan Shanmugam
Fix the following checkpatch errors: ERROR: trailing statements should be on next line ERROR: space required after that ',' (ctx:VxV) ERROR: code indent should use tabs where possible Cc: Christian König Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam --- .../gpu/drm/amd/amdgpu/amdgpu_co

RE: [PATCH Review 1/1] drm/amdgpu: Add SDMA_UTCL1_WR_FIFO_SED field for sdma_v4_4_ras_field

2023-04-27 Thread Zhang, Hawking
[AMD Official Use Only - General] Please add commit description. Apart from that, the change is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Stanley.Yang Sent: Thursday, April 27, 2023 21:19 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Zhou1, Tao Cc: Y

[PATCH 01/12] drm/amdgpu/gfx11: drop old bring up code

2023-04-27 Thread Alex Deucher
No longer used. Remove it. Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 70 ++ 1 file changed, 3 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_

[PATCH 03/12] drm/amdgpu: add [en/dis]able_kgq() functions

2023-04-27 Thread Alex Deucher
To replace the IP specific variants which are largely duplicate. Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 68 + drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 2 + 2 files changed, 70 insertions(+) diff --git a/dri

[PATCH 04/12] drm/amdgpu/gfx10: use generic [en/dis]able_kgq() helpers

2023-04-27 Thread Alex Deucher
And remove the duplicate local variants. Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 48 ++ 1 file changed, 2 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/am

[PATCH 07/12] drm/amdgpu/gfx11: drop unused variable

2023-04-27 Thread Alex Deucher
Just check the return value directly. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index d36d365cb582..256014a8c824 10

[PATCH 06/12] drm/amdgpu/gfx10: drop unused variable

2023-04-27 Thread Alex Deucher
Just check the return value directly. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 24d7134228b0..5c67c91c4297 10

[PATCH 02/12] drm/amdgpu/gfx10: drop old bring up code

2023-04-27 Thread Alex Deucher
No longer used. Remove it. Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 70 ++ 1 file changed, 3 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_

[PATCH 09/12] drm/amdgpu/gfx9: always restore kcq MQDs

2023-04-27 Thread Alex Deucher
Always restore the MQD not just when we do a reset. This allows us to move the MQD to VRAM if we want. v2: always reset ring pointer as well (Christian) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 7 ++- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 7 ++- 2 fi

[PATCH 08/12] drm/amdgpu/gfx8: always restore kcq MQDs

2023-04-27 Thread Alex Deucher
Always restore the MQD not just when we do a reset. This allows us to move the MQD to VRAM if we want. v2: always reset ring pointer as well (Christian) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git

[PATCH 05/12] drm/amdgpu/gfx11: use generic [en/dis]able_kgq() helpers

2023-04-27 Thread Alex Deucher
And remove the duplicate local variants. Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 49 ++ 1 file changed, 2 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/am

[PATCH 12/12] drm/amdgpu: put MQDs in VRAM

2023-04-27 Thread Alex Deucher
Reduces preemption latency. v2: move MES MQDs into VRAM as well (YuBiao) v3: enable on gfx10, 11 only (Alex) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 drivers/gpu/drm/amd/amdgpu/mes_v10_1.c | 1 + drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 1 + 3 files ch

[PATCH 10/12] drm/amdgpu/gfx10: always restore kcq/kgq MQDs

2023-04-27 Thread Alex Deucher
Always restore the MQD not just when we do a reset. This allows us to move the MQD to VRAM if we want. v2: always reset ring pointer as well (Christian) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) di

[PATCH 11/12] drm/amdgpu/gfx11: always restore kcq/kgq MQDs

2023-04-27 Thread Alex Deucher
Always restore the MQD not just when we do a reset. This allows us to move the MQD to VRAM if we want. v2: always reset ring pointer as well (Christian) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) di

Re: [PATCH] drm/amdkfd: Optimize svm range map to GPU with XNACK on

2023-04-27 Thread Felix Kuehling
On 2023-04-24 14:38, Philip Yang wrote: With XNACK on if svm_range_set_attr set the range access or access_in_place attribute, we don't call svm_range_validate_and_map to update GPU mapping. This avoids prefaulting the range pages on system memory if the range is not prefetch to VRAM and not ma

[PATCH v2 0/9] drm: fdinfo memory stats

2023-04-27 Thread Rob Clark
From: Rob Clark Similar motivation to other similar recent attempt[1]. But with an attempt to have some shared code for this. As well as documentation. It is probably a bit UMA-centric, I guess devices with VRAM might want some placement stats as well. But this seems like a reasonable start.

[PATCH v2 4/9] drm/amdgpu: Switch to fdinfo helper

2023-04-27 Thread Rob Clark
From: Rob Clark Signed-off-by: Rob Clark Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 16 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.h | 2 +- 3 files changed, 9 insertions(+), 12 deletions(-)

[PATCH 1/2] drm/amdgpu: drop invalid IP revision

2023-04-27 Thread Alex Deucher
This was already fixed and dropped in: commit baf3f8f37406 ("drm/amdgpu: handle SRIOV VCN revision parsing") commit c40bdfb2ffa4 ("drm/amdgpu: fix incorrect VCN revision in SRIOV") But seems to have been accidentally left around in a merge. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/

[PATCH 2/2] drm/amdgpu: drop unused function

2023-04-27 Thread Alex Deucher
amdgpu_discovery_get_ip_version() has not been used since commit c40bdfb2ffa4 ("drm/amdgpu: fix incorrect VCN revision in SRIOV") so drop it. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 48 --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h | 2

Re: [PATCH v3] drm/amdgpu: add a missing lock for AMDGPU_SCHED

2023-04-27 Thread Alex Deucher
Applied. Thanks! Alex On Wed, Apr 26, 2023 at 6:55 PM Chia-I Wu wrote: > > mgr->ctx_handles should be protected by mgr->lock. > > v2: improve commit message > v3: add a Fixes tag > > Signed-off-by: Chia-I Wu > Reviewed-by: Christian König > Fixes: 52c6a62c64fac ("drm/amdgpu: add interface for

RE: [PATCH] drm/amd/amdgpu: Simplify switch case statements in amdgpu_connectors.c

2023-04-27 Thread Deucher, Alexander
[Public] > -Original Message- > From: SHANMUGAM, SRINIVASAN > > Sent: Thursday, April 27, 2023 11:02 AM > To: Koenig, Christian ; Deucher, Alexander > > Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN > > Subject: [PATCH] drm/amd/amdgpu: Simplify switch case statements in > amd

Re: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-27 Thread Eric Huang
Hi Felix, I tested your patch on mGPU systems. It doesn't break any KFD eviction tests, because tests don't allocate DMABuf import, that doesn't trigger its eviction fence. The only thing the patch affects is in re-mapping DMABuf imports that the eviction will still be triggered. I have an

RE: [PATCH] drm/amdgpu: remove pasid_src field from IV entry

2023-04-27 Thread Liu, Aaron
[AMD Official Use Only - General] Good catch! The PASID_SRC bit is only used in IH_COOKIE which is sent as register write to the IH by IH_client. But in the interrupt packet from IH to driver, the corresponding bit is always reserved. PASID_SRC is not to be used for driver. Reviewed-by: Aaron L

RE: [PATCH] drm/amdgpu: Ignore KFD eviction fences invalidating preemptible DMABuf imports

2023-04-27 Thread Kuehling, Felix
[AMD Official Use Only - General] Re-mapping typically happens after evictions, before a new eviction fence gets attached. At that time the old eviction fence should be in the signaled state already, so it can't be signaled again. Therefore I would expect my patch to help with unmapping the DMA

[PATCH v4] drm/amdgpu: drop gfx_v11_0_cp_ecc_error_irq_funcs

2023-04-27 Thread Horatio Zhang
The gfx.cp_ecc_error_irq is retired in gfx11. In gfx_v11_0_hw_fini still use amdgpu_irq_put to disable this interrupt, which caused the call trace in this function. [ 102.873958] Call Trace: [ 102.873959] [ 102.873961] gfx_v11_0_hw_fini+0x23/0x1e0 [amdgpu] [ 102.874019] gfx_v11_0_suspend+0

[linux-next:master] BUILD SUCCESS WITH WARNING 84e2893b4573da3bc0c9f24e2005442e420e3831

2023-04-27 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 84e2893b4573da3bc0c9f24e2005442e420e3831 Add linux-next specific files for 20230427 Warning reports: https://lore.kernel.org/oe-kbuild-all/202304210303.nlmi0srq-...@intel.com https

RE: [PATCH v4] drm/amdgpu: drop gfx_v11_0_cp_ecc_error_irq_funcs

2023-04-27 Thread Zhang, Hawking
[AMD Official Use Only - General] + if (!(adev->gfx.cp_ecc_error_irq.funcs == NULL)) { Just check if (adev->gfx.cp_error_irq.funcs) { + r = amdgpu_irq_get(adev, &adev->gfx.cp_ecc_error_irq, 0); + if (r) + got
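
The review comment above asks for the pointer to be tested directly instead of negating a comparison with NULL. The sketch below shows that form on a simplified model, assuming a source whose `funcs` pointer is left NULL when the interrupt is retired. The struct and function names are hypothetical and are not the amdgpu definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified interrupt source; not the amdgpu structures. */
struct irq_funcs { int (*set)(int state); };
struct irq_src {
    const struct irq_funcs *funcs;  /* NULL when the interrupt is not used */
    int refcount;
};

/* Take a reference only when the source has handlers installed.
 * `if (src->funcs)` has the same truth value as
 * `if (!(src->funcs == NULL))` but reads more directly.
 * Returns 1 if a reference was taken, 0 otherwise. */
static int irq_get_if_present(struct irq_src *src)
{
    if (src->funcs) {
        src->refcount++;
        return 1;
    }
    return 0;
}

/* Matching release with the same guard, so a source that was never
 * acquired is never released. */
static void irq_put_if_present(struct irq_src *src)
{
    if (src->funcs && src->refcount > 0)
        src->refcount--;
}
```

Guarding both paths with the same test keeps get and put balanced, which is the kind of imbalance the call trace in the v4 patch description points at.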

[PATCH] drm/amdgpu: Enable mcbp under sriov by default

2023-04-27 Thread YuBiao Wang
Enable mcbp under sriov by default. Asics with soc21 supports mcbp now so we should set it enabled. Signed-off-by: YuBiao Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/

RE: [PATCH] drm/amdgpu: Enable mcbp under sriov by default

2023-04-27 Thread Chen, Horace
[AMD Official Use Only - General] Reviewed-By: Horace Chen -Original Message- From: amd-gfx On Behalf Of YuBiao Wang Sent: Friday, April 28, 2023 2:05 PM To: amd-gfx@lists.freedesktop.org Cc: Wang, YuBiao ; Xu, Feifei ; Chen, Horace ; Kevin Wang ; Tuikov, Luben ; Deucher, Alexander ;