RE: [PATCH] drm/amdgpu: perform mode2 reset for sdma fed error on gfx v11_0_3

2023-05-16 Thread Chai, Thomas
[AMD Official Use Only - General] reset_context is a local variable in amdgpu_ras_do_recovery. If gpu_reset_flag is not used, the code that reads the regRLC_RLCS_FED_STATUS_0 register and checks the sdma fed error field would have to move into amdgpu_ras_do_recovery, which may break the code structure of amdgpu_ras.c. amdgpu

RE: [PATCH] drm/amdgpu: fix incorrect pcie_gen_mask in passthrough case

2023-05-16 Thread Chen, Horace
[AMD Official Use Only - General] Hi Alex, Can you help review this patch? Currently, in passthrough, the GPU is also on the root bus but it is not an APU. The current driver regards it as an APU and limits the PCIe link speed to gen2, which causes failures in some OCL benchmarks. Thanks & Regards, Horace. --

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Limonciello, Mario
I think we replaced this with golden timestamp value which doesn't require GFX register access. Ah yes; through 5591a051b86b ("drm/amdgpu: refine get gpu clock counter method") This wasn't part of the kernel this was originally reported on. I suspect this would significantly decrease the l

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Limonciello, Mario
On 5/17/2023 12:26 AM, Lazar, Lijo wrote: On 5/17/2023 10:46 AM, Limonciello, Mario wrote: On 5/17/2023 12:07 AM, Lazar, Lijo wrote: On 5/17/2023 10:25 AM, Limonciello, Mario wrote: On 5/16/2023 11:43 PM, Lazar, Lijo wrote: On 5/17/2023 5:04 AM, Mario Limonciello wrote: DCN 3.1.4 s

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Lazar, Lijo
On 5/17/2023 10:46 AM, Limonciello, Mario wrote: On 5/17/2023 12:07 AM, Lazar, Lijo wrote: On 5/17/2023 10:25 AM, Limonciello, Mario wrote: On 5/16/2023 11:43 PM, Lazar, Lijo wrote: On 5/17/2023 5:04 AM, Mario Limonciello wrote: DCN 3.1.4 s2idle entry will hang occasionally on s2idle

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Limonciello, Mario
On 5/17/2023 12:07 AM, Lazar, Lijo wrote: On 5/17/2023 10:25 AM, Limonciello, Mario wrote: On 5/16/2023 11:43 PM, Lazar, Lijo wrote: On 5/17/2023 5:04 AM, Mario Limonciello wrote: DCN 3.1.4 s2idle entry will hang occasionally on s2idle entry, but only if running Wayland and only when us

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Lazar, Lijo
On 5/17/2023 10:25 AM, Limonciello, Mario wrote: On 5/16/2023 11:43 PM, Lazar, Lijo wrote: On 5/17/2023 5:04 AM, Mario Limonciello wrote: DCN 3.1.4 s2idle entry will hang occasionally on s2idle entry, but only if running Wayland and only when using `systemctl suspend`, not `echo mem | tee

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Limonciello, Mario
On 5/16/2023 11:43 PM, Lazar, Lijo wrote: On 5/17/2023 5:04 AM, Mario Limonciello wrote: DCN 3.1.4 s2idle entry will hang occasionally on s2idle entry, but only if running Wayland and only when using `systemctl suspend`, not `echo mem | tee /sys/power/state`. This happens because using `syst

Re: [PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Lazar, Lijo
On 5/17/2023 5:04 AM, Mario Limonciello wrote: DCN 3.1.4 s2idle entry will hang occasionally on s2idle entry, but only if running Wayland and only when using `systemctl suspend`, not `echo mem | tee /sys/power/state`. This happens because using `systemctl suspend` will cause the screen to loc

[PATCH v2 1/3] drm/amd: Flush any delayed gfxoff on suspend entry

2023-05-16 Thread Mario Limonciello
DCN 3.1.4 is reported to hang on s2idle entry if graphics activity is happening during entry. This is because GFXOFF was scheduled as delayed but RLC gets disabled in s2idle entry sequence which will hang GFX IP if not already in GFXOFF. To help this problem, flush any delayed work for GFXOFF ear

[PATCH v2 3/3] drm/amd: Skip RLC suspend for s0ix on PSP 13.0.4 and 13.0.11

2023-05-16 Thread Mario Limonciello
RLC suspend in s0ix is unnecessary as the SMU and IMU jointly manage graphics power state. Suggested-by: Alexander Deucher Signed-off-by: Mario Limonciello --- v1->v2: * Skip RLC all the time instead of adding safety to it --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 4 ++-- 1 file changed,

[PATCH v2 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Mario Limonciello
DCN 3.1.4 s2idle entry will hang occasionally on s2idle entry, but only if running Wayland and only when using `systemctl suspend`, not `echo mem | tee /sys/power/state`. This happens because using `systemctl suspend` will cause the screen to lock right before writing mem into /sys/power/state. T

[PATCH v2 2/3] drm/amd: Poll for GFX core to be off

2023-05-16 Thread Mario Limonciello
If GFXOFF was flushed during suspend entry it may take some time for GFX core to be powered down. Ensure that it's powered off before continuing any operations that may try to utilize related IP. This avoids hangs from stopping RLC as well as problems with fence interrupts timing out during s2idle
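
The polling idea described in this patch can be sketched in userspace C. This is a minimal illustration, not the patch's code (the diff is cut off above): `poll_core_off`, `core_is_off_fn` and `fake_core_is_off` are hypothetical names, and the status source is a callback standing in for whatever register the driver actually reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status source: returns true once the core reports "off".
 * In a driver this would be a register read; here it is a callback so
 * the sketch can run in userspace. */
typedef bool (*core_is_off_fn)(void *ctx);

/* Poll until the core reports off or max_polls attempts are used up.
 * Returns 0 on success, -1 on timeout. A real implementation would
 * sleep or delay between polls instead of spinning. */
static int poll_core_off(core_is_off_fn is_off, void *ctx,
                         unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (is_off(ctx))
            return 0;
    }
    return -1;
}

/* Fake hardware for demonstration: reports "off" once the counter
 * pointed to by ctx has been decremented to zero. */
static bool fake_core_is_off(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

Bounding the number of polls means a core that never powers down is reported as a timeout to the caller instead of blocking suspend entry forever.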

RE: [PATCH 2/3] drm/amd: Poll for GFX core to be off

2023-05-16 Thread Huang, Tim
[AMD Official Use Only - General] Hi Lijo, Yes, the GFX_IMU_MSG_FLAGS register is outside of the GFXOFF domain. It can be accessed while GFXOFF is entered. Best Regards, Tim Huang -Original Message- From: Lazar, Lijo Sent: Wednesday, May 17, 2023 10:48 AM To: Limonciello, Mario ; amd-gfx@lists.fre

RE: [PATCH] drm/amdgpu: perform mode2 reset for sdma fed error on gfx v11_0_3

2023-05-16 Thread Zhang, Hawking
[AMD Official Use Only - General] Shall we just force the mode-2 reset if it is non-fatal error mode? Is the gpu_reset_flag really necessary in such case? reset_context.method = AMD_RESET_METHOD_MODE2; Ideally, driver decides either perform reset or other error handling approach (i.e. unmap qu

RE: [PATCH 2/3] drm/amd: Poll for GFX core to be off

2023-05-16 Thread Lazar, Lijo
[AMD Official Use Only - General] Is this register GFX_IMU_MSG_FLAGS outside of GFXOFF domain? Thanks, Lijo -Original Message- From: amd-gfx On Behalf Of Mario Limonciello Sent: Tuesday, May 16, 2023 11:22 PM To: amd-gfx@lists.freedesktop.org Cc: Tsao, Anson ; Huang, Tim ; Martinez, J

Re: [PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF

2023-05-16 Thread Limonciello, Mario
On 5/16/2023 4:57 PM, Alex Deucher wrote: On Tue, May 16, 2023 at 5:50 PM Limonciello, Mario wrote: On 5/16/2023 4:39 PM, Alex Deucher wrote: On Tue, May 16, 2023 at 2:15 PM Mario Limonciello wrote: On GFX11 if RLC is stopped when not in GFXOFF the system will hang. Prevent this case from

[PATCH] drm/amdgpu: perform mode2 reset for sdma fed error on gfx v11_0_3

2023-05-16 Thread YiPeng Chai
perform mode2 reset for sdma fed error on gfx v11_0_3. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 8 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 5 + drivers/gpu/drm/amd/amdgpu/gfx_v11_0_3.c | 14 +- 3 files changed, 25 insertions(+), 2 de

Re: [v2,11/12] drm/fbdev-generic: Implement dedicated fbdev I/O helpers

2023-05-16 Thread Sui Jingfeng
Hi, Thomas After applying your patch set, the kernel built with arch/loongarch/configs/loongson3_defconfig no longer finishes compiling. gcc complains: AR drivers/gpu/built-in.a AR drivers/built-in.a AR built-in.a AR vmlinux.a LD vmlinux.o OBJCOPY modules.buil

RE: [PATCH 1/2] drm/amdgpu/vcn4: fix endian conversion

2023-05-16 Thread Chen, Guchun
Acked-by: Guchun Chen for this series. Regards, Guchun > -Original Message- > From: amd-gfx On Behalf Of Alex > Deucher > Sent: Wednesday, May 17, 2023 5:18 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: [PATCH 1/2] drm/amdgpu/vcn4: fix endian conversion >

RE: [PATCH] drm/amd/pm: add delay to avoid unintened shutdown due to hotspot temperature spark

2023-05-16 Thread Feng, Kenneth
[AMD Official Use Only - General] Do we really need this delay on all the ASICs? Maybe setting the default value to 0 is more reasonable? Thanks. -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: Tuesday, May 16, 2023 10:51 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexa

Re: [PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF

2023-05-16 Thread Alex Deucher
On Tue, May 16, 2023 at 5:50 PM Limonciello, Mario wrote: > > > On 5/16/2023 4:39 PM, Alex Deucher wrote: > > On Tue, May 16, 2023 at 2:15 PM Mario Limonciello > > wrote: > >> On GFX11 if RLC is stopped when not in GFXOFF the system will hang. > >> Prevent this case from ever happening. > >> > >>

Re: [PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF

2023-05-16 Thread Limonciello, Mario
On 5/16/2023 4:39 PM, Alex Deucher wrote: On Tue, May 16, 2023 at 2:15 PM Mario Limonciello wrote: On GFX11 if RLC is stopped when not in GFXOFF the system will hang. Prevent this case from ever happening. Tested-by: Juan Martinez Signed-off-by: Mario Limonciello --- drivers/gpu/drm/amd/

Re: [PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF

2023-05-16 Thread Alex Deucher
On Tue, May 16, 2023 at 2:15 PM Mario Limonciello wrote: > > On GFX11 if RLC is stopped when not in GFXOFF the system will hang. > Prevent this case from ever happening. > > Tested-by: Juan Martinez > Signed-off-by: Mario Limonciello > --- > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 > 1

[PATCH 2/2] drm/amdgpu/gmc9: fix 64 bit division in partition code

2023-05-16 Thread Alex Deucher
Rework logic or use do_div() to avoid problems on 32 bit. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 - drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 11 ++- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgp
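
For context, a plain `/` on a 64-bit value in kernel code built for a 32-bit architecture typically turns into a call to a libgcc helper that the kernel does not link, which is why `do_div()` is used instead. The sketch below only mirrors the `do_div()` contract in userspace (quotient stored in place, remainder returned); `sketch_do_div` is a hypothetical stand-in, not the kernel macro and not the code from this patch.

```c
#include <stdint.h>

/* Userspace stand-in for the do_div() contract: divide the 64-bit
 * value *n by a 32-bit base, store the quotient back in *n and
 * return the remainder. */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
    uint32_t rem = (uint32_t)(*n % base);

    *n /= base;
    return rem;
}
```

Because the dividend is modified in place, callers that still need the original value have to copy it before dividing.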

[PATCH 1/2] drm/amdgpu/vcn4: fix endian conversion

2023-05-16 Thread Alex Deucher
sq.is_enabled is a byte so there is no need to endian swap it. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c index c77c
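
The reasoning here is that byte order only matters for multi-byte values. The following userspace sketch shows what can go wrong when a one-byte flag is pushed through a 32-bit swap, as a little-endian conversion would do on a big-endian CPU. The helper names are hypothetical and this is not the code touched by the patch, whose diff is truncated above.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

/* A one-byte flag run through a 32-bit swap and stored back into a
 * byte: the set bit moves to the top byte and is lost on truncation. */
static uint8_t flag_after_bogus_swap(uint8_t flag)
{
    return (uint8_t)swap32(flag);
}
```

A single byte has no byte order, so storing it unconverted gives the same result on little-endian and big-endian hosts.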

[PATCH 2/3] drm/amd: Poll for GFX core to be off

2023-05-16 Thread Mario Limonciello
If GFXOFF was flushed during suspend entry it may take some time for GFX core to be powered down. Ensure that it's powered off before continuing any operations that may try to utilize related IP. This avoids hangs from stopping RLC as well as problems with fence interrupts timing out during s2idle

[PATCH 3/3] drm/amd: Add safety check to make sure RLC is only turned off while in GFXOFF

2023-05-16 Thread Mario Limonciello
On GFX11 if RLC is stopped when not in GFXOFF the system will hang. Prevent this case from ever happening. Tested-by: Juan Martinez Signed-off-by: Mario Limonciello --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_

[PATCH 1/3] drm/amd: Flush any delayed gfxoff on suspend entry

2023-05-16 Thread Mario Limonciello
DCN 3.1.4 is reported to hang on s2idle entry if graphics activity is happening during entry. This is because GFXOFF was scheduled as delayed but RLC gets disabled in s2idle entry sequence which will hang GFX IP if not already in GFXOFF. To help this problem, flush any delayed work for GFXOFF ear

[PATCH 0/3] Fix DCN 3.1.4 hangs on s2idle entry

2023-05-16 Thread Mario Limonciello
It's been observed that with DCN 3.1.4 s2idle entry will hang occasionally on s2idle entry, but only if running Wayland and only when using `systemctl suspend`, not `echo mem | tee /sys/power/state`. This happens because using `systemctl suspend` will cause the screen to lock right before writing

[linux-next:master] BUILD SUCCESS WITH WARNING 885df05bf634d589fbf030c3751614eaa453fb5d

2023-05-16 Thread kernel test robot
tree/branch: INFO setup_repo_specs: /db/releases/20230516180935/lkp-src/repo/*/linux-next https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 885df05bf634d589fbf030c3751614eaa453fb5d Add linux-next specific files for 20230516 Warning reports: https

Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread John Hubbard
On 5/16/23 07:35, David Hildenbrand wrote: ... >>> When passing NULL as "pages" to get_user_pages(), __get_user_pages_locked() >>> won't set FOLL_GET. As FOLL_PIN is also not set, we won't be messing with >>> the mapcount of the page. > > For completeness: s/mapcount/refcount/ :) whew, you had me

Re: [PATCH 2/5] drm/amd/display: Move three variable assignments behind condition checks in trigger_hotplug()

2023-05-16 Thread Markus Elfring
>> The address of a data structure member was determined before >> a corresponding null pointer check in the implementation of >> the function “trigger_hotplug”. >> >> Thus avoid the risk for undefined behaviour by moving the assignment >> for three local variables behind some condition checks. > >
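
The reordering that the quoted description proposes can be sketched as follows. The struct and function names are hypothetical and do not come from `trigger_hotplug()`; the point is only that the member address is taken after the pointer has been checked.

```c
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct connector { int link_id; };
struct wrapper { struct connector conn; };

/* Take the member address only after the null check, instead of
 * initializing the local variable at its declaration. */
static struct connector *get_conn(struct wrapper *w)
{
    struct connector *c;

    if (!w)
        return NULL;

    c = &w->conn;   /* assignment placed behind the check */
    return c;
}
```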

Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread David Hildenbrand
On 16.05.23 16:30, Sean Christopherson wrote: On Tue, May 16, 2023, David Hildenbrand wrote: On 15.05.23 21:07, Sean Christopherson wrote: On Sun, May 14, 2023, Lorenzo Stoakes wrote: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index cb5c13eee193..eaa5bb8dbadc 100644 --- a/virt/kvm/

Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread Sean Christopherson
On Tue, May 16, 2023, David Hildenbrand wrote: > On 15.05.23 21:07, Sean Christopherson wrote: > > On Sun, May 14, 2023, Lorenzo Stoakes wrote: > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > > index cb5c13eee193..eaa5bb8dbadc 100644 > > > --- a/virt/kvm/kvm_main.c > > > +++ b/virt

Re: [PATCH] drm:amd:amdgpu: Fix missing buffer object unlock in failure path

2023-05-16 Thread Alex Deucher
Applied. Thanks! Alex On Mon, May 15, 2023 at 6:27 PM Sukrut Bellary wrote: > > > On 5/3/23 16:15, Sukrut Bellary wrote: > > smatch warning - > > 1) drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:3615 gfx_v9_0_kiq_resume() > > warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'. > > > > 2) drivers

Re: [PATCH v2 03/12] drm/exynos: Use regular fbdev I/O helpers

2023-05-16 Thread Thomas Zimmermann
Hi Sam Am 15.05.23 um 19:43 schrieb Sam Ravnborg: Hi Thomas, On Mon, May 15, 2023 at 11:40:24AM +0200, Thomas Zimmermann wrote: Use the regular fbdev helpers for framebuffer I/O instead of DRM's helpers. Exynos does not use damage handling, so DRM's fbdev helpers are mere wrappers around the f

Re: [PATCH v5 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-05-16 Thread David Hildenbrand
On 15.05.23 21:07, Sean Christopherson wrote: On Sun, May 14, 2023, Lorenzo Stoakes wrote: No invocation of get_user_pages() uses the vmas parameter, so remove it. The GUP API is confusing and caveated. Recent changes have done much to improve that, however there is more we can do. Exporting vma

Re: [PATCH v2] drm/dp_mst: Clear MSG_RDY flag before sending new message

2023-05-16 Thread Jani Nikula
On Thu, 27 Apr 2023, Wayne Lin wrote: > [Why] > The sequence for collecting down_reply from source perspective should > be: > > Request_n->repeat (get partial reply of Request_n->clear message ready > flag to ack DPRX that the message is received) till all partial > replies for Request_n are recei

Re: [PATCH v2 02/12] drm/armada: Use regular fbdev I/O helpers

2023-05-16 Thread Thomas Zimmermann
Hi Am 15.05.23 um 20:04 schrieb Russell King (Oracle): On Mon, May 15, 2023 at 07:55:44PM +0200, Sam Ravnborg wrote: Hi Thomas, On Mon, May 15, 2023 at 11:40:23AM +0200, Thomas Zimmermann wrote: Use the regular fbdev helpers for framebuffer I/O instead of DRM's helpers. Armada does not use da

Re: [PATCH] drm/amd/pm: add delay to avoid unintened shutdown due to hotspot temperature spark

2023-05-16 Thread Alex Deucher
On Mon, May 15, 2023 at 10:52 PM Evan Quan wrote: > > There will be a double check for the hotspot temperature on delay > expired. This can avoid unintended shutdown due to hotspot temperature > spark. > > Signed-off-by: Evan Quan > -- > v1->v2: > - add the proper millidegree Celsius to degree

Re: [PATCH v2 02/12] drm/armada: Use regular fbdev I/O helpers

2023-05-16 Thread Thomas Zimmermann
Hi Am 15.05.23 um 19:55 schrieb Sam Ravnborg: Hi Thomas, On Mon, May 15, 2023 at 11:40:23AM +0200, Thomas Zimmermann wrote: Use the regular fbdev helpers for framebuffer I/O instead of DRM's helpers. Armada does not use damage handling, so DRM's fbdev helpers are mere wrappers around the fbdev

[PATCH] drm/amdgpu: fix incorrect pcie_gen_mask in passthrough case

2023-05-16 Thread Tong Liu01
[why] The passthrough case is treated as root bus and pcie_gen_mask is set to a default value that does not support GEN 3 and GEN 4 PCIe link speeds, so the PCIe link speed is downgraded at smu hw init in the passthrough condition. [how] Move virtualization detection before getting the PCIe info and check if it is

RE: [PATCH v2] drm/dp_mst: Clear MSG_RDY flag before sending new message

2023-05-16 Thread Lin, Wayne
[Public] Hi, Ping again for code review. Much appreciated! Regards, Wayne > -Original Message- > From: Lin, Wayne > Sent: Monday, May 8, 2023 5:49 PM > To: ly...@redhat.com; jani.nik...@intel.com; dri- > de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Cc: ville.syrj...@linu

Re: [PATCH v4 5/9] drivers: use new capable_any functionality

2023-05-16 Thread Alexander Gordeev
On Thu, May 11, 2023 at 04:25:28PM +0200, Christian Göttsche wrote: > Use the newly added capable_any function in appropriate cases, where a > task is required to have any of two capabilities. > > Reorder CAP_SYS_ADMIN last. > > Signed-off-by: Christian Göttsche > --- > v4: >Additional usage i