RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure

2021-12-15 Thread Chen, Guchun
[Public] My bad for misunderstanding this. There are spelling typos in both the patch subject and body, s/iff/if. The patch is: Reviewed-by: Guchun Chen Please wait for the ack from Andrey and Christian before pushing this. Regards, Guchun -Original Message- From: Shi, Leslie Sent: Thursday,

RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure

2021-12-15 Thread Shi, Leslie
[Public] Hi Guchun, As Andrey says, "we should not call amdgpu_device_unmap_mmio unless device is unplugged", so I think we should call amdgpu_device_unmap_mmio() only if the device is unplugged (drm_dev_enter() returns false). +if (!drm_dev_enter(adev_to_drm(adev), &idx)) + amdgpu_device_unma

RE: [PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure

2021-12-15 Thread Chen, Guchun
[Public] Hi Leslie, I think we need to modify it like: +if (drm_dev_enter(adev_to_drm(adev), &idx)) { + amdgpu_device_unmap_mmio(adev); + drm_dev_exit(idx); +} Also you need to credit Andrey with a 'Suggested-by' tag in your patch. Regards, Guchun -Original Message- From: Shi, Lesl
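The two replies in this thread differ only in which return value of drm_dev_enter() triggers the unmap. The following is a minimal userspace sketch of that difference, not kernel code: `mock_dev`, `mock_dev_enter()`, `mock_dev_exit()`, `mock_unmap_mmio()` and both teardown functions are hypothetical stand-ins. It assumes only that the enter call fails once the device is marked unplugged, as the messages above describe. The `else` branch that balances a successful enter in the first variant is an assumption of this sketch, since the quoted snippet is cut off before that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the real amdgpu_device. */
struct mock_dev {
    bool unplugged;    /* set once the device has been hot-unplugged */
    bool mmio_mapped;  /* true while the MMIO range is still mapped */
    int  enter_depth;  /* number of open enter/exit sections */
};

/* Mimics drm_dev_enter(): succeeds only while the device is still plugged. */
static bool mock_dev_enter(struct mock_dev *dev, int *idx)
{
    if (dev->unplugged)
        return false;
    *idx = dev->enter_depth++;
    return true;
}

/* Mimics drm_dev_exit(): closes a section opened by mock_dev_enter(). */
static void mock_dev_exit(struct mock_dev *dev, int idx)
{
    (void)idx;
    assert(dev->enter_depth > 0);
    dev->enter_depth--;
}

static void mock_unmap_mmio(struct mock_dev *dev)
{
    dev->mmio_mapped = false;
}

/* Variant from Leslie's reply: unmap only when the device is unplugged. */
static void teardown_unmap_if_unplugged(struct mock_dev *dev)
{
    int idx;

    if (!mock_dev_enter(dev, &idx))
        mock_unmap_mmio(dev);
    else
        mock_dev_exit(dev, idx); /* assumed: balance the successful enter */
}

/* Variant from Guchun's reply: unmap only while the device is present. */
static void teardown_unmap_if_present(struct mock_dev *dev)
{
    int idx;

    if (mock_dev_enter(dev, &idx)) {
        mock_unmap_mmio(dev);
        mock_dev_exit(dev, idx);
    }
}
```

With a device that is still plugged (the init-failure case discussed here), the first variant leaves the mock MMIO range mapped and the second one unmaps it; with an unplugged device the outcomes are reversed.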

[PATCH v2] drm/amdgpu: Call amdgpu_device_unmap_mmio() iff device is unplugged to prevent crash in GPU initialization failure

2021-12-15 Thread Leslie Shi
[Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it will start the error handling path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, the driver still visits the unmapped memo

RE: [PATCH] drm/amd/pm: Fix xgmi link control on aldebaran

2021-12-15 Thread Quan, Evan
[AMD Official Use Only] Reviewed-by: Evan Quan > -Original Message- > From: amd-gfx On Behalf Of Lijo > Lazar > Sent: Wednesday, December 15, 2021 11:50 PM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Yang, Stanley > ; Zhang, Hawking > Subject: [PATCH] drm/amd/pm: Fix

[PATCH] drm/amdgpu: fix mismatch warning between the prototype and function name

2021-12-15 Thread Huang Rui
Fix the typo to align with the prototype and function name. All warnings (new ones prefixed by >>): >> drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:631: warning: expecting prototype for amdgpu_fence_clear_job_fences(). Prototype was for amdgpu_fence_driver_clear_job_fences() instead Reported-by: ke

Re: [PATCH v3] drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence

2021-12-15 Thread Huang Rui
> git fetch --no-tags linux-review > Huang-Rui/drm-amdgpu-introduce-new-amdgpu_fence-object-to-indicate-the-job-embedded-fence/20211215-143731 > git checkout a47becf231b123760625c45242e89f5e5b5b4915 > # save the config file to linux buil

RE: [PATCH V5 00/16] Unified entry point for other blocks to interact with power

2021-12-15 Thread Quan, Evan
[AMD Official Use Only] Hi Lijo, Please check the latest series. All your comments were addressed except those about the return value (EOPNOTSUPP) for unimplemented APIs. I would like to handle that separately (with follow-up patches). BR Evan > -Original Message- > From: Quan, Evan > Sent: Mo

[pull] amdgpu drm-fixes-5.16

2021-12-15 Thread Alex Deucher
Hi Dave, Daniel, Fixes for 5.16. The following changes since commit 2585cf9dfaaddf00b069673f27bb3f8530e2039c: Linux 5.16-rc5 (2021-12-12 14:53:01 -0800) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-5.16-2021-12-15 for you to fe

Reply: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8

2021-12-15 Thread 周宗敏
The problematic board that I have tested is [AMD/ATI] Lexa PRO [Radeon RX 550/550X], and the vbios version is 113-RXF9310-C09-BT. When an exception occurs I can see the following changes in the value of the vram size read from RREG32(mmCONFIG_MEMSIZE); it seems to have garbage in the upper 16 bits and

[PATCH] drm/amdgpu: add support for IP discovery gc_info table v2

2021-12-15 Thread Alex Deucher
Used on gfx9 based systems. Fixes incorrect CU counts reported in the kernel log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 76 +-- drivers/gpu/drm/amd/include/discovery.h | 49 +

RE: [PATCH 1/2] drm/amdgpu: Separate vf2pf work item init from virt data exchange

2021-12-15 Thread Liu, Shaoyun
[AMD Official Use Only] Looks OK to me. This series is Reviewed-by: Shaoyun.liu Regards Shaoyun.liu -Original Message- From: amd-gfx On Behalf Of Victor Skvortsov Sent: Thursday, December 9, 2021 11:48 AM To: amd-gfx@lists.freedesktop.org Cc: Skvortsov, Victor Subject: [PATCH 1/2] d

[PATCH] drm/amdgpu: Filter security violation registers

2021-12-15 Thread Bokun Zhang
Recently, there was a security policy update under SRIOV. We need to filter the registers that hit the violation and move the code to the host driver side so that the guest driver can execute correctly. Signed-off-by: Bokun Zhang Change-Id: Ida893bb17de17a80e865c7662f04c5562f5d2727 --- drivers/gpu/

RE: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers

2021-12-15 Thread Skvortsov, Victor
[AMD Official Use Only] Hey Alex, This change was based on the fact that amd-mainline-dkms-5.13 calls get_xgmi_info() in gmc_v9_0_early_init(). But I can see that in drm-next it's instead called in gmc_v9_0_sw_init(). So, I'm not sure what the correct behavior is. But I do agree that the change is

Re: Various problems trying to vga-passthrough a Renoir iGPU to a xen/qubes-os hvm

2021-12-15 Thread Alex Deucher
Thinking about this more, I think the problem might be related to CPU access to "VRAM". APUs don't have dedicated VRAM, they use a reserved carve out region at the top of system memory. For CPU access to this memory, we kmap the physical address of the carve out region of system memory. You'll n

Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers

2021-12-15 Thread Alex Deucher
On Wed, Dec 15, 2021 at 1:56 PM Victor Skvortsov wrote: > > In SRIOV, RLC function pointers must be initialized early as > we rely on the RLCG interface for all GC register access. > > Signed-off-by: Victor Skvortsov > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ > drivers/gpu/drm

Re: [PATCH 6/7] drm/amdgpu: Ensure kunmap is called on error

2021-12-15 Thread Ira Weiny
On Tue, Dec 14, 2021 at 08:09:29AM +0100, Christian König wrote: > On 14.12.21 at 04:37, Ira Weiny wrote: > > On Mon, Dec 13, 2021 at 09:37:32PM +0100, Christian König wrote: > > > On 11.12.21 at 00:24, ira.we...@intel.com wrote: > > > > From: Ira Weiny > > > > > > > > The default case leaves t

Re: [PATCH v4 4/6] drm: implement a method to free unused pages

2021-12-15 Thread Arunpravin
On 14/12/21 12:10 am, Matthew Auld wrote: > On 01/12/2021 16:39, Arunpravin wrote: >> On contiguous allocation, we round up the size >> to the *next* power of 2, implement a function >> to free the unused pages after the newly allocated block. >> >> v2(Matthew Auld): >>- replace function name
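The quoted commit message describes rounding a contiguous request up to a power of two and then freeing the pages that were not asked for. The following is a rough sketch of that arithmetic only. `round_up_pow2()` and `unused_pages_after_roundup()` are hypothetical helpers, not part of the drm_buddy interface, and the sketch assumes a size that is already a power of two is left unchanged.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round n up to a power of two.
 * A value that is already a power of two is returned unchanged (assumption).
 */
static uint64_t round_up_pow2(uint64_t n)
{
    uint64_t p = 1;

    /* n must be non-zero and small enough that the result fits in 64 bits. */
    assert(n != 0 && n <= (UINT64_C(1) << 63));
    while (p < n)
        p <<= 1;
    return p;
}

/* Pages allocated beyond the request, i.e. the tail that could be freed. */
static uint64_t unused_pages_after_roundup(uint64_t requested_pages)
{
    return round_up_pow2(requested_pages) - requested_pages;
}
```

For a request of 5 pages the rounded allocation is 8 pages, so 3 pages past the requested range are candidates for being handed back.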

[PATCH v2 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov

2021-12-15 Thread Victor Skvortsov
Expand RLCG interface for new GC read & write commands. New interface will only be used if the PF enables the flag in pf2vf msg. v2: Added a description for the scratch registers Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 117 -- 1 file c

[PATCH v2 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers

2021-12-15 Thread Victor Skvortsov
In SRIOV, RLC function pointers must be initialized early as we rely on the RLCG interface for all GC register access. v2: Make aldebaran a separate case Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 4 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3

[PATCH v2 1/5] drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions

2021-12-15 Thread Victor Skvortsov
Add helper macros to change register access from direct to indirect. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index

[PATCH v2 2/5] drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov

2021-12-15 Thread Victor Skvortsov
Modify GC register access from MMIO to RLCG if the indirect flag is set v2: Replaced ternary operator with if-else for better readability Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 57 --- 1 file changed, 43 insertions(+), 14 deletions(-)

[PATCH v2 3/5] drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov

2021-12-15 Thread Victor Skvortsov
Modify GC register access from MMIO to RLCG if the indirect flag is set Signed-off-by: Victor Skvortsov --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b

[PATCH v2 0/5] *** GFX9 RLCG Interface modifications ***

2021-12-15 Thread Victor Skvortsov
This patchset introduces an expanded sriov RLCG interface. This interface will be used by Aldebaran in sriov mode for indirect GC register access during full access. v2: Added descriptions to scratch registers, and improved code readability. Victor Skvortsov (5): drm/amdgpu: Add *_SOC15_IP_NO_

Re: [PATCH v4 2/6] drm: improve drm_buddy_alloc function

2021-12-15 Thread Arunpravin
On 14/12/21 12:29 am, Matthew Auld wrote: > On 09/12/2021 15:47, Paneer Selvam, Arunpravin wrote: >> [AMD Official Use Only] >> >> Hi Matthew, >> >> Ping on this? > > No new comments from me :) I guess just a question of what we should do > with the selftests, and then ofc at some point being

Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8

2021-12-15 Thread Alex Deucher
On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote: > > Some boards (like RX550) seem to have garbage in the upper > 16 bits of the vram size register. Check for > this and clamp the size properly. Fixes > boards reporting bogus amounts of vram. > > After adding this patch, the maximum GPU VRAM size

Re: [PATCH 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov

2021-12-15 Thread Nieto, David M
[Public] Gotcha. Can you add, prior to the implementation, a small description of how the interface and the different scratch registers work? It may be easier to review with a clear idea of the operation. I know the earlier implementation did not include it, but now that we are modifying it, it

Re: [PATCH 3/5] drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov

2021-12-15 Thread Nieto, David M
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Wednesday, December 15, 2021 10:55 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Niet

Re: [PATCH 1/5] drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions

2021-12-15 Thread Nieto, David M
[AMD Official Use Only] Reviewed-by: David Nieto From: Skvortsov, Victor Sent: Wednesday, December 15, 2021 10:55 AM To: amd-gfx@lists.freedesktop.org ; Deng, Emily ; Liu, Monk ; Ming, Davis ; Liu, Shaoyun ; Zhou, Peng Ju ; Chen, JingWen ; Chen, Horace ; Niet

Re: [PATCH 2/5] drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov

2021-12-15 Thread Nieto, David M
[AMD Official Use Only] I don't know what others may think, but this coding while correct: - WREG32_NO_KIQ(hub->vm_inv_eng0_req + - hub->eng_distance * eng, inv_req); + (vmhub == AMDGPU_GFXHUB_0) ? + WREG32_SOC15_IP_NO_

RE: [PATCH 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov

2021-12-15 Thread Skvortsov, Victor
[AMD Official Use Only] This was a bug in the original definition, but functionally it makes no difference (in both cases the macros resolve to the same value). From: Nieto, David M Sent: Wednesday, December 15, 2021 2:16 PM To: Skvortsov, Victor ; amd-gfx@lists.freedesktop.org; Deng, Emily

Re: [PATCH 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov

2021-12-15 Thread Nieto, David M
[AMD Official Use Only] scratch_reg0 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + mmSCRATCH_REG0)*4; scratch_reg1 = adev->rmmio + (adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG1)*4; - scratch_reg2 = adev->rmmio + (adev-

Re: [PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers

2021-12-15 Thread Nieto, David M
[AMD Official Use Only] case IP_VERSION(9, 4, 1): case IP_VERSION(9, 4, 2): amdgpu_device_ip_block_add(adev, &gfx_v9_0_ip_block); + if (amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2)) + gfx_v9_0_

[PATCH 4/5] drm/amdgpu: Initialize Aldebaran RLC function pointers

2021-12-15 Thread Victor Skvortsov
In SRIOV, RLC function pointers must be initialized early as we rely on the RLCG interface for all GC register access. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 ++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +-- drivers/gpu/drm/amd/amdgpu/gfx_v

[PATCH 2/5] drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov

2021-12-15 Thread Victor Skvortsov
Modify GC register access from MMIO to RLCG if the indirect flag is set Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 45 +++ 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu

[PATCH 5/5] drm/amdgpu: Modify indirect register access for gfx9 sriov

2021-12-15 Thread Victor Skvortsov
Expand RLCG interface for new GC read & write commands. New interface will only be used if the PF enables the flag in pf2vf msg. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 111 +++--- 1 file changed, 83 insertions(+), 28 deletions(-) diff --g

[PATCH 1/5] drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions

2021-12-15 Thread Victor Skvortsov
Add helper macros to change register access from direct to indirect. Signed-off-by: Victor Skvortsov --- drivers/gpu/drm/amd/amdgpu/soc15_common.h | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h index

[PATCH 3/5] drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov

2021-12-15 Thread Victor Skvortsov
Modify GC register access from MMIO to RLCG if the indirect flag is set Signed-off-by: Victor Skvortsov --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b

[PATCH 0/5] *** GFX9 RLCG Interface modifications ***

2021-12-15 Thread Victor Skvortsov
This patchset introduces an expanded sriov RLCG interface. This interface will be used by Aldebaran in sriov mode for indirect GC register access during full access. Victor Skvortsov (5): drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions drm/amdgpu: Modify indirect register access for gmc_

RE: [PATCH] drm/amd/pm: Fix xgmi link control on aldebaran

2021-12-15 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Lazar, Lijo Sent: Wednesday, December 15, 2021 23:50 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Yang, Stanley Subject: [PATCH] drm/amd/pm: Fix xgmi link control on aldebaran Fix the

[PATCH] drm/amd/pm: Fix xgmi link control on aldebaran

2021-12-15 Thread Lijo Lazar
Fix the message argument. 0: Allow power down 1: Disallow power down Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/
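The argument encoding listed in this patch description is easy to invert by accident, because "allow" maps to 0. The following is a small sketch of that mapping only. The enum and `xgmi_power_down_msg_arg()` are hypothetical names for illustration and are not taken from aldebaran_ppt.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Argument values as listed in the patch description above. */
enum xgmi_power_down_arg {
    XGMI_ALLOW_POWER_DOWN    = 0,
    XGMI_DISALLOW_POWER_DOWN = 1,
};

/* Hypothetical helper: map an "allow power down" request to the argument. */
static uint32_t xgmi_power_down_msg_arg(bool allow_power_down)
{
    return allow_power_down ? XGMI_ALLOW_POWER_DOWN
                            : XGMI_DISALLOW_POWER_DOWN;
}
```

Naming the two values keeps the inverted sense visible at the call site, which a bare `enable ? 1 : 0` would hide.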

[PATCH] drm/amdkfd: use max() and min() to make code cleaner

2021-12-15 Thread cgel . zte
From: Changcheng Deng Use max() and min() in order to make code cleaner. Reported-by: Zeal Robot Signed-off-by: Changcheng Deng --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/

[PATCH] drm/amdgpu: fixup bad vram size on gmc v8

2021-12-15 Thread Zongmin Zhou
Some boards (like RX550) seem to have garbage in the upper 16 bits of the vram size register. Check for this and clamp the size properly. Fixes boards reporting bogus amounts of vram. After adding this patch, the maximum GPU VRAM size is 64GB; otherwise only 64GB vram size will be used. Signed-off-b
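The check described above can be sketched as a pure function on the raw register value. This is an illustration under stated assumptions, not the patch itself: `clamp_vram_size_mb()` and `vram_size_bytes()` are hypothetical names, and the sketch assumes the register reports the size in megabytes. That assumption would fit the 64GB ceiling mentioned above, since 16 bits of megabytes top out just below 64GB.

```c
#include <stdint.h>

/*
 * Hypothetical helper: if the upper 16 bits are set and the lower 16 bits
 * hold a non-zero value, treat the upper bits as garbage and keep only the
 * lower 16 bits. Otherwise return the value unchanged.
 */
static uint32_t clamp_vram_size_mb(uint32_t raw)
{
    if ((raw & 0xffff0000u) && (raw & 0x0000ffffu))
        raw &= 0x0000ffffu;
    return raw;
}

/* Convert the (assumed) megabyte count to bytes without 32-bit overflow. */
static uint64_t vram_size_bytes(uint32_t raw)
{
    return (uint64_t)clamp_vram_size_mb(raw) * 1024u * 1024u;
}
```

A raw value such as 0xABCD0800 is reduced to 0x0800 (2048 MB), while a value with an empty lower half is left alone because there is nothing sensible to clamp to.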

Re: [PATCH] drm/amdgpu: add drm_dev_unplug() in GPU initialization failure to prevent crash

2021-12-15 Thread Andrey Grodzovsky
I think that we should not call amdgpu_device_unmap_mmio unless the device is unplugged (as in amdgpu_pci_remove) because the point of this function is to prevent accesses to the MMIO range the device was occupying before removal. There is no point in preventing MMIO accesses when init failed and we want

Re: [PATCH v3] drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence

2021-12-15 Thread kernel test robot
, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Huang-Rui/drm-amdgpu-introduce-new-amdgpu_fence-object-to-indicate-the-job-embedded-fence/20211215-143731 base: git://anongit.freedesktop.org/drm/dr

RE: [PATCH] drm/amdgpu: add drm_dev_unplug() in GPU initialization failure to prevent crash

2021-12-15 Thread Chen, Guchun
[Public] Hi Christian, Your question is a really good one. The patch to unmap MMIO in such an early phase is from Andrey's patch: drm/amdgpu: Unmap all MMIO mappings. It's a patch from half a year ago, and everything looked fine until this case. Regards, Guchun -Original Message- From: Koenig,

Re: [PATCH] drm/amdgpu: add drm_dev_unplug() in GPU initialization failure to prevent crash

2021-12-15 Thread Christian König
On 15.12.21 at 09:46, Leslie Shi wrote: [Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it will start the error handling path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callba

Re: [PATCH v3] drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence

2021-12-15 Thread Christian König
On 15.12.21 at 07:35, Huang Rui wrote: The job embedded fence doesn't initialize the flags at dma_fence_init(). Then we will go the wrong way in the amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enable the trace event here. So introduce new amdgpu_fence object to i

[PATCH] drm/amdgpu: add drm_dev_unplug() in GPU initialization failure to prevent crash

2021-12-15 Thread Leslie Shi
[Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it will start the error handling path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, the driver still visits the unmapped memo