[PATCH] drm/amdgpu: Fix incorrect resource realease in amdgpu_init()

2024-09-06 Thread Jinjie Ruan
If pci_register_driver() fails, amdgpu_sync_slab and amdgpu_fence_slab should be freed in the error path, fix it. Signed-off-by: Jinjie Ruan Suggested-by: Thomas Gleixner --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/driv
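
The error-path idea in this patch can be illustrated with a minimal userspace sketch of goto-based unwinding. It is not the patch itself: `cache_create()`, `cache_destroy()`, `fake_register_driver()` and `example_init()` are hypothetical stand-ins for the slab-cache and `pci_register_driver()` calls, assumed here only to show resources being released in reverse order when a later step fails.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two slab caches and the registration state. */
static void *sync_cache;
static void *fence_cache;
static bool driver_registered;

static int cache_create(void **cache)
{
    *cache = malloc(64);
    return *cache ? 0 : -1;
}

static void cache_destroy(void **cache)
{
    free(*cache);
    *cache = NULL;
}

/* Stand-in for pci_register_driver(); 'fail' forces the error path. */
static int fake_register_driver(bool fail)
{
    if (fail)
        return -1;
    driver_registered = true;
    return 0;
}

/* Returns 0 on success; on failure, everything created so far is released. */
int example_init(bool fail_register)
{
    int r;

    r = cache_create(&sync_cache);
    if (r)
        return r;

    r = cache_create(&fence_cache);
    if (r)
        goto err_sync;

    r = fake_register_driver(fail_register);
    if (r)
        goto err_fence;

    return 0;

err_fence:
    cache_destroy(&fence_cache);
err_sync:
    cache_destroy(&sync_cache);
    return r;
}
```

Calling `example_init(true)` exercises the failure path and leaves both cache pointers NULL; `example_init(false)` keeps both caches and marks the driver as registered.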

[PATCH -next] drm/amdgpu/mes11: fix bad alignments

2024-09-06 Thread Jiapeng Chong
No functional modification involved. ./drivers/gpu/drm/amd/amdgpu/mes_v11_0.c:418:3-9: code aligned with following code on line 419. Reported-by: Abaci Robot Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=10742 Signed-off-by: Jiapeng Chong --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c |

[INFO] multi-gpu scenario

2024-09-06 Thread __- -__
Hi, I am trying to find a way to get desktop performance on Linux comparable to Windows/macOS on systems that are Optimus-enhanced (NVIDIA) focused. I have read about off-screen Mesa. Ubuntu has a gl_dispatcher strategy, so why not make OSMesa the default and let gl_dispatcher take care of relative device

[PATCH] drm/amdkfd: Select reset method for poison handling

2024-09-06 Thread Hawking Zhang
Driver mode-2 is only supported by relatively new SMC firmware. Signed-off-by: Hawking Zhang --- .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 40 +++ 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm

RE: [PATCH] drm/amdkfd: Select reset method for poison handling

2024-09-06 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -Original Message- > From: Hawking Zhang > Sent: Friday, September 6, 2024 4:13 PM > To: amd-gfx@lists.freedesktop.org; Zhou1, Tao > Cc: Zhang, Hawking > Subject: [PATCH] drm/amdkfd: Select reset method f

[PATCH] drm/amdgpu: Fix JPEG v4.0.3 register write

2024-09-06 Thread Lijo Lazar
EXTERNAL_REG_INTERNAL_OFFSET/EXTERNAL_REG_WRITE_ADDR should be used in pairs. If an external register shouldn't be written, both packets shouldn't be sent. Fixes: a78b48146972 ("drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV") Signed-off-by: Lijo Lazar --- drivers/gpu

Re: [PATCH] drm/amdkfd: Select reset method for poison handling

2024-09-06 Thread Lazar, Lijo
On 9/6/2024 1:42 PM, Hawking Zhang wrote: > Driver mode-2 is only supported by relatively new > SMC firmware. > > Signed-off-by: Hawking Zhang > --- > .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 40 +++ > 1 file changed, 32 insertions(+), 8 deletions(-) > > diff --git a/dri

[PATCH] drm/amdgpu: Fix missing check pcie_p2p module param

2024-09-06 Thread Bob Zhou
The module param pcie_p2p should be checked for kfd p2p feature, so add it. Fixes: a9b55f03989a ("drm/amdgpu: Take IOMMU remapping into account for p2p checks") Signed-off-by: Bob Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --g

[PATCH] drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3

2024-09-06 Thread Kenneth Feng
fix the pp_dpm_pcie issue on smu v14.0.2/3 as below: 0: 2.5GT/s, x4 250Mhz 1: 8.0GT/s, x4 616Mhz * 2: 8.0GT/s, x4 1143Mhz * the middle level can be removed since it is always skipped on smu v14.0.2/3 Signed-off-by: Kenneth Feng --- drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c | 3 +++ 1

RE: [PATCH] drm/amdkfd: Select reset method for poison handling

2024-09-06 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Right. That involves more changes, from the kfd to amdkfd interface through to the amdgpu ras interface. We also need to consider re-enabling unmap queue at some point. Let me think more about how to put these together and make it part of the upcoming ras

Re: [PATCH] drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3

2024-09-06 Thread Alex Deucher
On Fri, Sep 6, 2024 at 8:52 AM Kenneth Feng wrote: > > fix the pp_dpm_pcie issue on smu v14.0.2/3 as below: > 0: 2.5GT/s, x4 250Mhz > 1: 8.0GT/s, x4 616Mhz * > 2: 8.0GT/s, x4 1143Mhz * > the middle level can be removed since it is always skipped on > smu v14.0.2/3 > > Signed-off-by: Kenneth Feng

Re: [PATCH] drm/amdgpu: always allocate cleared VRAM for GEM allocations

2024-09-06 Thread Marek Olšák
Can you also bump the DRM version, so that userspace knows when to skip its own clear? Also, clearing with SDMA takes up to 33 times more time (= is up to 97% slower) than clearing with compute. Marek On Thu, Aug 29, 2024 at 2:23 PM Paneer Selvam, Arunpravin wrote: > > this will fix performance
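
The two figures in the message above can be reconciled with a short worked calculation. This assumes "33 times more time" means the SDMA clear takes 33 times as long as the compute clear for the same amount of memory, and that "slower" refers to throughput; the symbols below are introduced only for illustration.

```latex
\[
t_{\mathrm{SDMA}} = 33\, t_{\mathrm{compute}}
\quad\Longrightarrow\quad
\frac{\text{rate}_{\mathrm{SDMA}}}{\text{rate}_{\mathrm{compute}}}
  = \frac{t_{\mathrm{compute}}}{t_{\mathrm{SDMA}}}
  = \frac{1}{33} \approx 0.030
\]
\[
1 - \frac{1}{33} = \frac{32}{33} \approx 0.970
\quad\text{(about 97\% lower throughput)}
\]
```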

Re: [PATCH] drm/amdgpu: Raise dma resv usage for created TLB fence

2024-09-06 Thread Andjelkovic, Dejan
[AMD Official Use Only - AMD Internal Distribution Only] I might have worded that poorly, I meant that it seems like TLB flush is out of sync with the SDMA update, which leads to a page fault reliably. I don't feel it has anything to do with the implicit sync in itself. When TLB fence is create

Re: [PATCH v5.15-v5.10] drm/amd/pm: Fix the null pointer dereference for vega10_hwmgr

2024-09-06 Thread Mukul Sikka
On Fri, Sep 6, 2024 at 12:05 AM Alex Deucher wrote: > > On Tue, Sep 3, 2024 at 5:53 AM sikkamukul wrote: > > > > From: Bob Zhou > > > > [ Upstream commit 50151b7f1c79a09117837eb95b76c2de76841dab ] > > > > Check return value and conduct null pointer handling to avoid null pointer > > dereference

Re: [PATCH] drm/amdkfd: Fix resource leak in kriu rsetore queue

2024-09-06 Thread Alex Deucher
Typo in patch subject: kriu -> criu Alex On Fri, Sep 6, 2024 at 2:03 AM wrote: > > From: "jesse.zh...@amd.com" > > To avoid memory leaks, release q_extra_data when exiting the restore queue. > > Signed-off-by: Jesse Zhang > --- > drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 + >

Re: [PATCH] drm/amdgpu: Raise dma resv usage for created TLB fence

2024-09-06 Thread Christian König
Well that's the whole reason I'm asking :) Why do you think it should be added as dependency in amdgpu_vm_sdma_update? As far as I can see that is complete nonsense. Page table updates never depend on TLB flushes, it's the TLB flush which depends on the page table update. Regards, Christian

Re: [PATCH] drm/amdgpu/mes11: Indent an if statment

2024-09-06 Thread Alex Deucher
Applied. Thanks! Alex On Thu, Sep 5, 2024 at 3:08 PM Dan Carpenter wrote: > > Indent the "break" statement one more tab. > > Signed-off-by: Dan Carpenter > --- > drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd

Re: [PATCH] drm/amdgpu: Fix missing check pcie_p2p module param

2024-09-06 Thread Alex Deucher
On Fri, Sep 6, 2024 at 5:48 AM Bob Zhou wrote: > > The module param pcie_p2p should be checked for kfd p2p feature, so add it. > > Fixes: a9b55f03989a ("drm/amdgpu: Take IOMMU remapping into account for p2p > checks") > Signed-off-by: Bob Zhou Reviewed-by: Alex Deucher > --- > drivers/gpu/dr

Re: [PATCH v5.15-v5.10] drm/amd/pm: Fix the null pointer dereference for vega10_hwmgr

2024-09-06 Thread Alex Deucher
On Fri, Sep 6, 2024 at 4:50 AM Mukul Sikka wrote: > > On Fri, Sep 6, 2024 at 12:05 AM Alex Deucher wrote: > > > > On Tue, Sep 3, 2024 at 5:53 AM sikkamukul wrote: > > > > > > From: Bob Zhou > > > > > > [ Upstream commit 50151b7f1c79a09117837eb95b76c2de76841dab ] > > > > > > Check return value a

Re: [PATCH 2/2] drm/amd/amdgpu: apply command submission parser for JPEG v1

2024-09-06 Thread Alex Deucher
On Thu, Sep 5, 2024 at 5:40 PM David (Ming Qiang) Wu wrote: > > Similar to jpeg_v2_dec_ring_parse_cs() but it has different > register ranges and a few other registers access. > > Signed-off-by: David (Ming Qiang) Wu Series is: Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/jpeg_v1

[PATCH] drm/amdgpu/atomfirmware: Silence UBSAN warning

2024-09-06 Thread Alex Deucher
Per the comments, these are variable sized arrays. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3613 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/include/atomfirmware.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/include/atomfirmw

Re: [PATCH v5 19/44] drm/vkms: add 3x4 matrix in color pipeline

2024-09-06 Thread Harry Wentland
On 2024-08-27 13:49, Louis Chauvet wrote: > Le 19/08/24 - 16:56, Harry Wentland a écrit : >> We add two 3x4 matrices into the VKMS color pipeline. The reason >> we're adding matrices is so that we can test that application >> of a matrix and its inverse yields an output equal to the input >> ima

Re: [PATCH] drm/amdgpu: update suspend status for aborting from deeper suspend

2024-09-06 Thread Deucher, Alexander
[AMD Official Use Only - AMD Internal Distribution Only] Can you elaborate on how this fails? Seems like maybe we should just get rid of adev->suspend_complete and just check the MP0 SOL register to determine whether or not we need to reset the GPU on resume. Alex

[PATCH] Revert "drm/amdgpu: Add flags to distinguish vf/pf/pt mode"

2024-09-06 Thread Alex Deucher
This reverts commit f03b874313cc9b5859596fe9c5b368387b6da771. This is unused so far and has not gone upstream yet, so remove it until the userspace side is ready. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 9 -

Re: [PATCH] drm/amdgpu: always allocate cleared VRAM for GEM allocations

2024-09-06 Thread Alex Deucher
On Fri, Sep 6, 2024 at 10:18 AM Marek Olšák wrote: > > Can you also bump the DRM version, so that userspace knows when to > skip its own clear? Sure, although going forward, it might be better to migrate to a generic flags query in the INFO ioctl so we can just check for various feature bits so w

[PATCH] drm/amdgpu: bump driver version for cleared VRAM

2024-09-06 Thread Alex Deucher
Driver now clears VRAM on allocation. Bump the driver version so mesa knows when it will get cleared vram by default. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_d
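
How userspace might act on such a version bump can be sketched as a simple major/minor comparison. This is an assumption-laden illustration, not Mesa's or libdrm's code: `struct drv_version`, `version_at_least()` and `can_skip_userspace_clear()` are hypothetical names, and the first version that clears VRAM is passed in as a parameter because the snippet above does not state it.

```c
#include <stdbool.h>

/* Hypothetical holder for a driver version as reported to userspace. */
struct drv_version {
    int major;
    int minor;
};

/* True if 'have' is the same as or newer than 'need'. */
bool version_at_least(struct drv_version have, struct drv_version need)
{
    if (have.major != need.major)
        return have.major > need.major;
    return have.minor >= need.minor;
}

/*
 * 'first_clearing' is the (assumed, caller-supplied) first version in which
 * the kernel clears VRAM on allocation; userspace can skip its own clear
 * when the running driver is at least that new.
 */
bool can_skip_userspace_clear(struct drv_version have,
                              struct drv_version first_clearing)
{
    return version_at_least(have, first_clearing);
}
```

A caller would fill `have` from whatever version query it already performs and keep its own clear as the fallback when the check returns false.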

[RFC 3/4] drm/sched: Always increment correct scheduler score

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin An entity's run queue can change during drm_sched_entity_push_job(), so make sure to update the score consistently. Signed-off-by: Tvrtko Ursulin Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple queues") Cc: Nirmoy Das Cc: Christian König Cc: Luben

[RFC 4/4] drm/sched: Optimise drm_sched_entity_push_job

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin In FIFO mode we can avoid dropping the lock only to immediately re-acquire it by adding a new drm_sched_rq_update_fifo_locked() helper. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Alex Deucher Cc: Luben Tuikov Cc: Matthew Brost --- drivers/gpu/drm/scheduler/sche

[RFC 2/4] drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin Since drm_sched_entity_modify_sched() can modify the entity's run queue, let's make sure to only dereference the pointer once so both adding and waking up are guaranteed to be consistent. Signed-off-by: Tvrtko Ursulin Fixes: b37aced31eb0 ("drm/scheduler: implement a function t
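
The "dereference once" idea described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct run_queue`, `struct entity` and `example_push_job()` are hypothetical, C11 atomics stand in for whatever synchronization the real scheduler uses, and the two counters stand in for "adding" and "waking up".

```c
#include <stdatomic.h>

/* Hypothetical run queue with one counter per action. */
struct run_queue {
    int queued;   /* jobs added to this queue */
    int wakeups;  /* times this queue's scheduler was woken */
};

/* Hypothetical entity whose run queue pointer may be changed concurrently. */
struct entity {
    _Atomic(struct run_queue *) rq;
};

/*
 * Read the run queue pointer exactly once and use that local copy for both
 * steps, so "add" and "wake up" always target the same queue even if
 * another thread changes e->rq in between.
 */
struct run_queue *example_push_job(struct entity *e)
{
    struct run_queue *rq = atomic_load(&e->rq);

    rq->queued++;
    rq->wakeups++;
    return rq;
}
```

Reading `e->rq` separately for each step would allow the job to be added to one queue while a different one is woken, which is the inconsistency the local copy avoids.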

[RFC 0/4] DRM scheduler fixes, or not, or incorrect kind

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin In a recent conversation with Christian there was a thought that drm_sched_entity_modify_sched() should start using the entity->rq_lock to be safe against job submission and simultaneous priority changes. The kerneldoc accompanying that function however is a bit unclear to m

[RFC 1/4] drm/sched: Add locking to drm_sched_entity_modify_sched

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin Without the locking amdgpu currently can race amdgpu_ctx_set_entity_priority() and drm_sched_job_arm(), leading to the latter accessing a potentially inconsistent entity->sched_list and entity->num_sched_list pair. The comment on drm_sched_entity_modify_sched() however says: ""

[RFC 0/2] drm/amdgpu: No need for dynamic DRM priority?

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin In a recent conversation with Christian there was a thought that dynamic DRM scheduling priority changes are not required, or even not desired (actively prevented?!), and can be ripped out. For more context, starting point for that conversation was me observing that they (dy

[RFC 2/2] drm/sched: Remove drm_sched_entity_set_priority

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin Now that no callers exist, let's remove the whole misleading helper. Misleading because runtime changes do not reliably work due to drm_sched_entity_select_rq() only acting on idle entities. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Alex Deucher Cc: Luben Tuikov

[RFC 1/2] drm/amdgpu: Remove dynamic DRM scheduling priority override

2024-09-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin According to Christian the dynamic DRM priority override was only interesting before the hardware priority (done via drm_sched_entity_modify_sched()) existed. Furthermore, both overrides also only work somewhat on paper while in reality they are only effective if the entity i

Re: [PATCH 1/2] Documentation/gpu: Document the situation with unqualified drm-memory-

2024-09-06 Thread Alex Deucher
On Wed, Sep 4, 2024 at 4:36 AM Tvrtko Ursulin wrote: > > > On 21/08/2024 21:47, Alex Deucher wrote: > > On Tue, Aug 13, 2024 at 9:57 AM Tvrtko Ursulin wrote: > >> > >> From: Tvrtko Ursulin > >> > >> Currently it is not well defined what is drm-memory- compared to other > >> categories. > >> > >>

Re: [PATCH] drm/amdgpu: always allocate cleared VRAM for GEM allocations

2024-09-06 Thread Marek Olšák
On Fri, Sep 6, 2024 at 1:53 PM Alex Deucher wrote: > > On Fri, Sep 6, 2024 at 10:18 AM Marek Olšák wrote: > > > > Can you also bump the DRM version, so that userspace knows when to > > skip its own clear? > > Sure, although going forward, it might be better to migrate to a > generic flags query i

Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang

2024-09-06 Thread Leo Li
Hi Mikhail, I've tried to align my system with yours as best as I can, but so far, I've had no luck reproducing the hang. A video of what I'm doing: https://youtu.be/VeD-LPCnfWM?si=b2baF8MyDBuU4jRH (Under the hood, the W7900 and 7900xt should be the same) I have a few suggestions: First, can

[pull] amdgpu, amdkfd drm-next-6.12

2024-09-06 Thread Alex Deucher
Hi Dave, Simona, Updates for 6.12. The following changes since commit e55ef65510a401862b902dc979441ea10ae25c61: Merge tag 'amd-drm-next-6.12-2024-08-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next (2024-08-27 14:33:12 +0200) are available in the Git repository at: https:/

[PATCH] drm/amd/display: Do not reset planes based on crtc zpos_changed

2024-09-06 Thread sunpeng.li
From: Leo Li [Why] drm_normalize_zpos will set the crtc_state->zpos_changed to 1 if any of its assigned planes changes zpos, or is removed/added from it. To have amdgpu_dm request a plane reset on this is too broad. For example, if only the cursor plane was moved from one crtc to another, the

Re: [PATCH] drm/amd/display: Do not reset planes based on crtc zpos_changed

2024-09-06 Thread Harry Wentland
On 2024-09-06 17:20, sunpeng...@amd.com wrote: > From: Leo Li > > [Why] > > drm_normalize_zpos will set the crtc_state->zpos_changed to 1 if any of > its assigned planes changes zpos, or is removed/added from it. > > To have amdgpu_dm request a plane reset on this is too broad. For > exampl