Re: amdgpu driver halted on suspend of shutdown

2021-09-29 Thread Christian König
Well, you could remove it locally if it solves your problem at hand. But keep in mind that a lot of ARM boards are simply not compliant with the PCIe specification, and the hardware won't work correctly on those in general. I'm pretty sure you have one of those cases here. Christian. On 30.09.2

Re: [PATCH] Documentation/gpu: remove spurious "+" in amdgpu.rst

2021-09-29 Thread Christian König
On 29.09.21 at 19:45, Alex Deucher wrote: Not sure why that was there. Remove it. Signed-off-by: Alex Deucher Reviewed-by: Christian König --- Documentation/gpu/amdgpu.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/gpu/amdgpu.rst b/Documentat

Re: [PATCH] drm/amdgpu: consolidate case statements

2021-09-29 Thread Christian König
On 29.09.21 at 19:45, Alex Deucher wrote: IP_VERSION(11, 0, 13) does the exact same thing as IP_VERSION(11, 0, 12) so squash them together. Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 7 --- 1 file changed, 7 deletions(-) d

RE: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Yu, Lang
>-Original Message- >From: Kuehling, Felix >Sent: Thursday, September 30, 2021 11:26 AM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Huang, Ray > >Subject: Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak > >On 2021-09-29 10:38 p.m., Yu, Lang wrote:

Re: [PATCH 2/4] amdgpu_ucode: reduce number of pr_debug calls

2021-09-29 Thread jim . cromie
On Wed, Sep 29, 2021 at 8:08 PM Joe Perches wrote: > > On Wed, 2021-09-29 at 19:44 -0600, Jim Cromie wrote: > > There are blocks of DRM_DEBUG calls, consolidate their args into > > single calls. With dynamic-debug in use, each callsite consumes 56 > > bytes of callsite data, and this patch remove

Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Felix Kuehling
On 2021-09-29 10:38 p.m., Yu, Lang wrote: -Original Message- From: Kuehling, Felix Sent: Thursday, September 30, 2021 10:28 AM To: Yu, Lang ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak On 2021-0

RE: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Yu, Lang
>-Original Message- >From: Kuehling, Felix >Sent: Thursday, September 30, 2021 10:28 AM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Huang, Ray > >Subject: Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak > >On 2021-09-29 10:23 p.m., Yu, Lang wrote:

[pull] amdgpu drm-fixes-5.15

2021-09-29 Thread Alex Deucher
Hi Dave, Daniel, Fixes for 5.15. The following changes since commit 05812b971c6d605c00987750f422918589aa4486: Merge tag 'drm/tegra/for-5.15-rc3' of ssh://git.freedesktop.org/git/tegra/linux into drm-fixes (2021-09-28 17:08:44 +1000) are available in the Git repository at: https://gitlab.

Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Felix Kuehling
On 2021-09-29 10:23 p.m., Yu, Lang wrote: -Original Message- From: Kuehling, Felix Sent: Thursday, September 30, 2021 9:47 AM To: Yu, Lang ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak On 2021-09

RE: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Yu, Lang
>-Original Message- >From: Kuehling, Felix >Sent: Thursday, September 30, 2021 9:47 AM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Huang, Ray > >Subject: Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak > >On 2021-09-29 7:32 p.m., Yu, Lang wrote: >>

Re: [PATCH 2/4] amdgpu_ucode: reduce number of pr_debug calls

2021-09-29 Thread Joe Perches
On Wed, 2021-09-29 at 19:44 -0600, Jim Cromie wrote: > There are blocks of DRM_DEBUG calls, consolidate their args into > single calls. With dynamic-debug in use, each callsite consumes 56 > bytes of callsite data, and this patch removes about 65 calls, so > it saves ~3.5kb. > > no functional cha

Re: [PATCH] drm/amdkfd: avoid conflicting address mappings

2021-09-29 Thread Felix Kuehling
On 2021-09-29 7:35 p.m., Mike Lothian wrote: Hi This patch is causing a compile failure for me drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:1254:25: error: unused variable 'svms' [-Werror,-Wunused-variable] struct svm_range_list *svms = &p->svms; ^ 1

Re: amdgpu driver halted on suspend of shutdown

2021-09-29 Thread 李真能
So, can I remove the suspend process in amdgpu_pci_shutdown if I don't use the amdgpu driver in a VM? Thank you so much for your reply! On 2021/9/30 at 5:12 AM, Alex Deucher wrote: On Wed, Sep 29, 2021 at 3:25 AM 李真能 wrote: Hello: When I do loop auto test of reboot, I found kernel may halt on me

Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Felix Kuehling
On 2021-09-29 7:32 p.m., Yu, Lang wrote: [AMD Official Use Only] -Original Message- From: Kuehling, Felix Sent: Wednesday, September 29, 2021 11:25 PM To: Yu, Lang ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdkfd: fix a potential cu

[PATCH 4/4] i915/gvt: remove spaces in pr_debug "gvt: core:" etc prefixes

2021-09-29 Thread Jim Cromie
Taking embedded spaces out of existing prefixes makes them better class-prefixes; simplifying the extra quoting needed otherwise: $> echo format "^gvt: core:" +p >control vs $> echo format ^gvt:core: +p >control Dropping the internal spaces means that quotes are only needed when the trailing
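The quoting point in this patch description can be illustrated with a small tokenisation sketch. It assumes whitespace-and-quote splitting as done by Python's `shlex`, which only approximates the behaviour described; it is not the kernel's dynamic-debug query parser.

```python
import shlex

# The two query strings from the patch description.
with_spaces = 'format "^gvt: core:" +p'
without_spaces = 'format ^gvt:core: +p'

# Quoted prefix: the embedded space survives only because of the quotes.
print(shlex.split(with_spaces))     # ['format', '^gvt: core:', '+p']

# Space-free prefix: three tokens, no quoting needed.
print(shlex.split(without_spaces))  # ['format', '^gvt:core:', '+p']

# The spaced prefix without quotes breaks into an extra token.
print(shlex.split('format ^gvt: core: +p'))  # ['format', '^gvt:', 'core:', '+p']
```

With the space-free prefix the match string is always a single token, which is the simplification the patch is after.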

[PATCH 3/4] nouveau: fold multiple DRM_DEBUG_DRIVERs together

2021-09-29 Thread Jim Cromie
With DRM_USE_DYNAMIC_DEBUG, each callsite record requires 56 bytes. We can combine 12 into one here and save ~620 bytes. Signed-off-by: Jim Cromie --- drivers/gpu/drm/nouveau/nouveau_drm.c | 36 +-- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/drivers/g

[PATCH 2/4] amdgpu_ucode: reduce number of pr_debug calls

2021-09-29 Thread Jim Cromie
There are blocks of DRM_DEBUG calls, consolidate their args into single calls. With dynamic-debug in use, each callsite consumes 56 bytes of callsite data, and this patch removes about 65 calls, so it saves ~3.5kb. no functional changes. RFC: this creates multi-line log messages, does that break
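The savings figure quoted above is simple arithmetic and can be checked with a short sketch. The 56-byte record size and the count of roughly 65 removed calls come from the patch description; the helper name `bytes_saved` is made up for illustration.

```python
CALLSITE_BYTES = 56  # per-callsite record size quoted in the patch description


def bytes_saved(callsites_removed: int, per_callsite: int = CALLSITE_BYTES) -> int:
    """Return the callsite data saved by removing the given number of calls."""
    return callsites_removed * per_callsite


saved = bytes_saved(65)
print(saved)                   # 3640
print(round(saved / 1024, 2))  # 3.55
```

The same helper applied to the nouveau patch in this series, where 12 calls are folded into one (11 removed), gives `bytes_saved(11) == 616`, in line with the "~620 bytes" quoted there.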

[PATCH 1/4] drm: fix doc grammar error

2021-09-29 Thread Jim Cromie
no code changes, good for rc Signed-off-by: Jim Cromie --- include/drm/drm_drv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 0cd95953cdf5..4b29261c4537 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -4

[PATCH 0/4] drm: maintenance patches for 5.15-rcX

2021-09-29 Thread Jim Cromie
hi drm folks, Here's a small set of assorted patches which are IMO suitable for rcX; one doc fix, 2 patches folding multiple DBGs together, and a format string modification. Jim Cromie (4): drm: fix doc grammar error amdgpu_ucode: reduce number of pr_debug calls nouveau: fold multiple DRM_DE

Re: [PATCH] drm/amdkfd: avoid conflicting address mappings

2021-09-29 Thread Mike Lothian
Hi This patch is causing a compile failure for me drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:1254:25: error: unused variable 'svms' [-Werror,-Wunused-variable] struct svm_range_list *svms = &p->svms; ^ 1 error generated. I'll turn off Werror On Mon,

RE: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Yu, Lang
[AMD Official Use Only] >-Original Message- >From: Kuehling, Felix >Sent: Wednesday, September 29, 2021 11:25 PM >To: Yu, Lang ; amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Huang, Ray > >Subject: Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak > >On 2021-09-29 at

Re: amdgpu driver halted on suspend of shutdown

2021-09-29 Thread Alex Deucher
On Wed, Sep 29, 2021 at 3:25 AM 李真能 wrote: > > Hello: > > When I do loop auto test of reboot, I found kernel may halt > on memcpy_fromio of amdgpu's amdgpu_uvd_suspend, so I remove suspend > process in amdgpu_pci_shutdown, and it will fix this bug. > > I have 3 questions to ask: > > 1.

Re: [PATCH] drm/amdgpu/display: protect DCN specific stuff in process_deferred_updates

2021-09-29 Thread Harry Wentland
On 2021-09-29 16:36, Alex Deucher wrote: > Need to protect this function with CONFIG_DRM_AMD_DC_DCN. > > Fixes: bfd34644dedb ("drm/amd/display: Defer LUT memory powerdown until LUT > bypass latches") > Cc: Michael Strauss > Cc: Eric Yang > Cc: Anson Jacob > Reported-by: Stephen Rothwell > Sig

[PATCH] drm/amdgpu/display: protect DCN specific stuff in process_deferred_updates

2021-09-29 Thread Alex Deucher
Need to protect this function with CONFIG_DRM_AMD_DC_DCN. Fixes: bfd34644dedb ("drm/amd/display: Defer LUT memory powerdown until LUT bypass latches") Cc: Michael Strauss Cc: Eric Yang Cc: Anson Jacob Reported-by: Stephen Rothwell Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/

[PATCH 2/2] amd/amdgpu_dm: Verify Gamma and Degamma LUT sizes using DRM Core check

2021-09-29 Thread Mark Yacoub
From: Mark Yacoub [Why] drm_atomic_helper_check_crtc now verifies both legacy and non-legacy LUT sizes. There is no need to check it within amdgpu_dm_atomic_check. [How] Remove the local call to verify LUT sizes and use DRM Core function instead. Tested on ChromeOS Zork. Signed-off-by: Mark Ya

[PATCH 1/2] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.

2021-09-29 Thread Mark Yacoub
From: Mark Yacoub [Why] 1. drm_atomic_helper_check doesn't check for the LUT sizes of either Gamma or Degamma props in the new CRTC state, allowing any invalid size to be passed on. 2. Each driver has its own LUT size, which could also be different for legacy users. [How] 1. Create |degamma_lut_

[PATCH 2/2] drm/amdgpu/jpeg: add jpeg2.6 start/end

2021-09-29 Thread James Zhu
Add jpeg2.6 with updated PCTL0_MMHUB_DEEPSLEEP_IB address in start/end. Signed-off-by: James Zhu --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 40 -- 1 file changed, 38 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c b/drivers/gp

[PATCH 1/2] drm/amdgpu/jpeg2: move jpeg2 shared macro to header file

2021-09-29 Thread James Zhu
Move jpeg2 shared macro to header file Signed-off-by: James Zhu --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 20 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h | 20 2 files changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg

[PATCH v3 2/2] amd/display: only require overlay plane to cover whole CRTC on ChromeOS

2021-09-29 Thread Simon Ser
Commit ddab8bd788f5 ("drm/amd/display: Fix two cursor duplication when using overlay") changed the atomic validation code to forbid the overlay plane from being used if it doesn't cover the whole CRTC. The motivation is that ChromeOS uses the atomic API for everything except the cursor plane (which

[PATCH v3 1/2] amd/display: check cursor plane matches underlying plane

2021-09-29 Thread Simon Ser
The current logic checks whether the cursor plane blending properties match the primary plane's. However that's wrong, because the cursor is painted on all planes underneath. If the cursor is over the primary plane and the cursor plane, it's painted on both pipes. Iterate over the CRTC planes and

RE: [PATCH] drm/amdgpu/display: remove unused variable

2021-09-29 Thread Ma, Hanghong
[AMD Official Use Only] Hi Alex, This looks good to me, and thanks for the clean up. Reviewed-by: Leo (Hanghong) Ma -Leo -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Wednesday, September 29, 2021 1:47 PM To: Deucher, Alexander Cc: amd-gfx list Subject: Re: [PATCH]

Re: [PATCH] drm/amdgpu/display: remove unused variable

2021-09-29 Thread Alex Deucher
Ping? On Mon, Sep 27, 2021 at 3:08 PM Alex Deucher wrote: > > No longer used, drop it. > > Fixes: 1e07005161fc ("drm/amd/display: add function to convert hw to dpcd > lane settings") > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 -- > 1 file changed

[PATCH] Documentation/gpu: remove spurious "+" in amdgpu.rst

2021-09-29 Thread Alex Deucher
Not sure why that was there. Remove it. Signed-off-by: Alex Deucher --- Documentation/gpu/amdgpu.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/gpu/amdgpu.rst b/Documentation/gpu/amdgpu.rst index 364680cdad2e..8ba72e898099 100644 --- a/Documentation/gp

[PATCH] drm/amdgpu: consolidate case statements

2021-09-29 Thread Alex Deucher
IP_VERSION(11, 0, 13) does the exact same thing as IP_VERSION(11, 0, 12) so squash them together. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 7 --- 1 file changed, 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/

Re: [PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Felix Kuehling
On 2021-09-29 at 4:22 a.m., Lang Yu wrote: > If user doesn't explicitly call kfd_ioctl_destroy_queue > to destroy all created queues, when the kfd process is > destroyed, some queues' cu_mask memory are not freed. > > To avoid forgetting to free them in some places, > free them immediately after u

Re: [PATCH] drm/amdkfd: fix a potential ttm->sg memory leak

2021-09-29 Thread Felix Kuehling
On 2021-09-29 at 4:19 a.m., Lang Yu wrote: > Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr, > but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it! > > Signed-off-by: Lang Yu Please add Fixes: 264fb4d332f5 ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Review

Re: [PATCH] drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early

2021-09-29 Thread Andrey Grodzovsky
Can you test this change with the hotunplug tests in libdrm? Since the tests are still in disabled mode until the latest fixes propagate to drm-next upstream, you will need to comment out https://gitlab.freedesktop.org/mesa/drm/-/blob/main/tests/amdgpu/hotunplug_tests.c#L65 I recently fixed a few regres

Re: [PATCH v2] drm/amd/display: Only define DP 2.0 symbols if not already defined

2021-09-29 Thread Harry Wentland
On 2021-09-28 23:58, Navare, Manasi D wrote: > [AMD Official Use Only] > > We have merged such DRM definition dependencies previously through a topic > branch in order to avoid redefining inside the driver. > But yes guarding this with ifdef is good. > > Reviewed-by: Manasi Navare > Ah, I

Re: [PATCH] drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early

2021-09-29 Thread Alex Deucher
On Wed, Sep 29, 2021 at 5:22 AM YuBiao Wang wrote: > > Some IP such as SMU need irq_put to perform hw_fini. > So move irq_fini_hw after ip_fini. > > Signed-off-by: YuBiao Wang This looks correct in general, but will this code: if (!amdgpu_device_has_dc_support(adev))

Re: [PATCH 2/2] drm/amdgpu: init iommu after amdkfd device init

2021-09-29 Thread Zhu, James
[AMD Official Use Only] Hi Felix, Since the previous patch can help with the PCO suspend/resume hang issue, let me work with YiFan to see if there is a proper way to cover both cases. Thanks & Best Regards! James Zhu From: Kuehling, Felix Sent: Tuesday, September 28

[PATCH v3 11/16] drm/amdkfd: CRIU restore queue doorbell id

2021-09-29 Thread David Yat Sin
When re-creating queues during CRIU restore, restore the queue with the same doorbell id value used during CRIU dump. Signed-off-by: David Yat Sin --- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 60 +-- 1 file changed, 41 insertions(+), 19 deletions(-) diff --git a/drivers/g

[PATCH v3 15/16] drm/amdkfd: CRIU implement gpu_id remapping

2021-09-29 Thread David Yat Sin
When doing a restore on a different node, the gpu_id's on the restore node may be different. But the user space application will still use the original gpu_id's in the ioctl calls. Add code to create a gpu id mapping so that kfd can determine the actual gpu_id during the user ioctl's. Signed-
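The remapping idea described above can be sketched as a lookup table. This is only an illustration under stated assumptions: the class `GpuIdMap`, its method names, and the id values are hypothetical and are not taken from the patch or from the kfd code.

```python
class GpuIdMap:
    """Hypothetical table from checkpoint-time gpu_ids to restore-time gpu_ids."""

    def __init__(self):
        self._user_to_actual = {}

    def add(self, user_gpu_id, actual_gpu_id):
        # Recorded once at restore time, when the new node's GPUs are known.
        self._user_to_actual[user_gpu_id] = actual_gpu_id

    def actual(self, user_gpu_id):
        # Translate the id the application still uses into the real one;
        # unknown ids are rejected rather than passed through unchanged.
        try:
            return self._user_to_actual[user_gpu_id]
        except KeyError:
            raise ValueError(f"unknown gpu_id {user_gpu_id:#x}") from None


# Made-up id values, for illustration only.
id_map = GpuIdMap()
id_map.add(0x1A2B, 0x3C4D)
print(hex(id_map.actual(0x1A2B)))  # 0x3c4d
```

The application keeps passing the original id, and the translation happens at the point where the id is consumed.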

[PATCH v3 09/16] drm/amdkfd: CRIU restore queue ids

2021-09-29 Thread David Yat Sin
When re-creating queues during CRIU restore, restore the queue with the same queue id value used during CRIU dump. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +- driv

[PATCH v3 13/16] drm/amdkfd: CRIU dump/restore queue control stack

2021-09-29 Thread David Yat Sin
Dump contents of queue control stacks on CRIU dump and restore them during CRIU restore. Signed-off-by: David Yat Sin Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +- .../drm/amd/amdkfd/kfd_device_

[PATCH v3 14/16] drm/amdkfd: CRIU dump and restore events

2021-09-29 Thread David Yat Sin
Add support to existing CRIU ioctl's to save and restore events during criu checkpoint and restore. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 61 + drivers/gpu/drm/amd/amdkfd/kfd_events.c | 322 +-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h

[PATCH v3 12/16] drm/amdkfd: CRIU dump and restore queue mqds

2021-09-29 Thread David Yat Sin
Dump contents of queue MQD's on CRIU dump and restore them during CRIU restore. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 72 -- .../dr

[PATCH v3 07/16] drm/amdkfd: CRIU Implement KFD pause ioctl

2021-09-29 Thread David Yat Sin
Introducing pause IOCTL. The CRIU amdgpu plugin needs to call AMDKFD_IOC_CRIU_PAUSE(pause = 1) before starting dump and AMDKFD_IOC_CRIU_PAUSE(pause = 0) when dump is complete. This ensures that the queues are not modified between each CRIU dump ioctl. Signed-off-by: David Yat Sin --- drivers/

[PATCH v3 16/16] drm/amdkfd: CRIU export kfd bos as prime dmabuf objects

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj KFD buffer objects do not have a GEM handle associated with them, so they cannot be used directly with libdrm to initiate a system DMA (sDMA) operation to speed up the checkpoint and restore operation. Export them as dmabuf objects and use them with the libdrm helper (amdgpu_bo_import) to fur

[PATCH v3 08/16] drm/amdkfd: CRIU add queues support

2021-09-29 Thread David Yat Sin
Add support to existing CRIU ioctl's to save number of queues and queue properties for each queue during checkpoint and re-create queues on restore. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 25 +- ..

[PATCH v3 10/16] drm/amdkfd: CRIU restore sdma id for queues

2021-09-29 Thread David Yat Sin
When re-creating queues during CRIU restore, restore the queue with the same sdma id value used during CRIU dump. Signed-off-by: David Yat Sin --- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 +- .../amd/amdkfd/kfd_pro

[PATCH v3 04/16] drm/amdkfd: CRIU Implement KFD dumper ioctl

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj This adds support to discover the buffer objects that belong to a process being checkpointed. The data corresponding to these buffer objects is returned to user space plugin running under criu master context which then stores this info to recreate these buffer objects dur

[PATCH v3 00/16] CHECKPOINT RESTORE WITH ROCm

2021-09-29 Thread David Yat Sin
CRIU is a user space tool which is very popular for container live migration in datacentres. It can checkpoint a running application, save its complete state, memory contents and all system resources to images on disk which can be migrated to another machine and restored later. More information

[PATCH v3 01/16] x86/configs: CRIU update debug rock defconfig

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj - Update debug config for Checkpoint-Restore (CR) support - Also include necessary options for CR with docker containers. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: David Yat Sin --- arch/x86/configs/rock-dbg_defconfig | 53 ++--- 1 file

[PATCH v3 05/16] drm/amdkfd: CRIU Implement KFD restore ioctl

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj This implements the KFD CRIU Restore ioctl that lays the basic foundation for the CRIU restore operation. It provides support to create the buffer objects corresponding to Non-Paged system memory mapped for GPU and/or CPU access and lays basic foundation for the userptrs b

[PATCH v3 06/16] drm/amdkfd: CRIU Implement KFD resume ioctl

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj This adds support to create userptr BOs on restore and introduces a new ioctl to restart memory notifiers for the restored userptr BOs. When doing CRIU restore MMU notifications can happen anytime after we call amdgpu_mn_register. Prevent MMU notifications until we reach s

[PATCH v3 02/16] drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj Checkpoint-Restore in userspace (CRIU) is a powerful tool that can snapshot a running process and later restore it on the same or a remote machine, but it expects processes that have a device file (e.g. GPU) associated with them to provide the necessary driver support to assist CRIU

[PATCH v3 03/16] drm/amdkfd: CRIU Implement KFD process_info ioctl

2021-09-29 Thread David Yat Sin
From: Rajneesh Bhardwaj This IOCTL is expected to be called as a precursor to the actual Checkpoint operation. This does the basic discovery into the target process seized by CRIU and relays the information to the userspace that utilizes it to start the Checkpoint operation via another dedicated

Re: [PATCH] drm/amd/pm: Fix that RPM cannot be obtained for specific GPU

2021-09-29 Thread Christian König
On 28.09.21 at 23:50, Alex Deucher wrote: On Tue, Sep 28, 2021 at 2:29 AM Christian König wrote: On 28.09.21 at 02:49, huangyizhi wrote: The current mechanism for obtaining RPM is to read tach_period from the register, and then calculate the RPM together with the frequency. But we found that

Re: [PATCH 61/64] drm/amdgpu: add support for SRIOV in IP discovery path

2021-09-29 Thread Christian König
On 28.09.21 at 18:42, Alex Deucher wrote: Handle SRIOV requirements when adding IP blocks. v2: add comment about UVD/VCE support on vega20 SR-IOV Signed-off-by: Alex Deucher Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 34 ++- 1 file ch

Re: [PATCH 59/64] drm/amdgpu: convert IP version array to include instances

2021-09-29 Thread Christian König
On 28.09.21 at 18:42, Alex Deucher wrote: Allow us to query instances versions more cleanly. Instancing support is not consistent unfortunately. SDMA is a good example. Sienna cichlid has 4 total SDMA instances, each enumerated separately (HWIDs 42, 43, 68, 69). Arcturus has 8 total SDMA inst

[PATCH] drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early

2021-09-29 Thread YuBiao Wang
Some IP such as SMU need irq_put to perform hw_fini. So move irq_fini_hw after ip_fini. Signed-off-by: YuBiao Wang --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/

Re: [PATCH 52/64] drm/amdgpu: get VCN and SDMA instances from IP discovery table

2021-09-29 Thread Christian König
On 28.09.21 at 18:42, Alex Deucher wrote: Rather than hardcoding it. We already have the number of VCN instances from a previous patch, so just update the VCN instances for chips with static tables. v2: squash in checks for SDMA3,4 (Guchun) v3: clarify VCN changes Signed-off-by: Alex Deucher

Re: [PATCH 50/64] drm/amdgpu: add VCN1 hardware IP

2021-09-29 Thread Christian König
On 28.09.21 at 18:42, Alex Deucher wrote: So we can store the VCN IP revision for each instance of VCN. Signed-off-by: Alex Deucher Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h

Re: [PATCH 31/64] drm/amdgpu/soc15: export common IP functions

2021-09-29 Thread Christian König
On 28.09.21 at 18:42, Alex Deucher wrote: So they can be driven by IP discovery table. Signed-off-by: Alex Deucher Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/soc15.c | 2 +- drivers/gpu/drm/amd/amdgpu/soc15.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff

Re: [PATCH 27/64] drm/amdgpu/nv: convert to IP version checking

2021-09-29 Thread Christian König
On 28.09.21 at 18:42, Alex Deucher wrote: Use IP versions rather than asic_type to differentiate IP version specific features. Signed-off-by: Alex Deucher Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/nv.c | 75 + 1 file changed, 38 insertions

[PATCH] drm/amdkfd: fix a potential cu_mask memory leak

2021-09-29 Thread Lang Yu
If user doesn't explicitly call kfd_ioctl_destroy_queue to destroy all created queues, when the kfd process is destroyed, some queues' cu_mask memory are not freed. To avoid forgetting to free them in some places, free them immediately after use. Signed-off-by: Lang Yu --- drivers/gpu/drm/amd/a

[PATCH] drm/amdkfd: fix a potential ttm->sg memory leak

2021-09-29 Thread Lang Yu
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr, but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it! Signed-off-by: Lang Yu --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_a

amdgpu driver halted on suspend of shutdown

2021-09-29 Thread 李真能
Hello: When I do loop auto test of reboot, I found kernel may halt on memcpy_fromio of amdgpu's amdgpu_uvd_suspend, so I remove suspend process in amdgpu_pci_shutdown, and it will fix this bug. I have 3 questions to ask: 1. In amdgpu_pci_shutdown, the comment explains why we must e

RE: [PATCH] V2: drm/amdgpu: resolve RAS query bug

2021-09-29 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking From: Clements, John Sent: Wednesday, September 29, 2021 15:03 To: Clements, John ; amd-gfx@lists.freedesktop.org; Zhang, Hawking Subject: RE: [PATCH] V2: drm/amdgpu: resolve RAS query bug [AMD Official Use Only] Updated patch with simpler solutio

RE: [PATCH] V2: drm/amdgpu: resolve RAS query bug

2021-09-29 Thread Clements, John
[AMD Official Use Only] Updated patch with simpler solution From: amd-gfx On Behalf Of Clements, John Sent: Wednesday, September 29, 2021 2:07 PM To: amd-gfx@lists.freedesktop.org; Zhang, Hawking Subject: [PATCH] drm/amdgpu: resolve RAS query bug [AMD Official Use Only] Submitting patch to

RE: [PATCH] drm/amdgpu: resolve RAS query bug

2021-09-29 Thread Zhang, Hawking
Thanks John! Let's try to use amdgpu_ras_query_error_status for that purpose Regards, Hawking From: Clements, John Sent: Wednesday, September 29, 2021 14:07 To: amd-gfx@lists.freedesktop.org; Zhang, Hawking Subject: [PATCH] drm/amdgpu: resolve RAS query bug [AMD Official Use Only] Submitting