Re: Expecting to revert commit 55285e21f045 "fbdev/efifb: Release PCI device ..."

2021-12-20 Thread Christian König
Good morning guys, first of all get better soon Linus. I'm unfortunately not the best expert for runtime power management (Alex) nor display (Harry), but from the lack of their response I guess that they are already on vacation. So maybe take everything I explain here with a grain of salt.

Re: [RFC 2/6] drm/amdgpu: Move scheduler init to after XGMI is ready

2021-12-20 Thread Christian König
On 20.12.21 at 22:51, Andrey Grodzovsky wrote: On 2021-12-20 2:16 a.m., Christian König wrote: On 17.12.21 at 23:27, Andrey Grodzovsky wrote: Before we initialize schedulers we must know which reset domain we are in - for a single device there is a single domain per device and so a single wq

Re: [RFC 3/6] drm/amdgpu: Fix crash on modprobe

2021-12-20 Thread Christian König
On 20.12.21 at 20:22, Andrey Grodzovsky wrote: On 2021-12-20 2:17 a.m., Christian König wrote: On 17.12.21 at 23:27, Andrey Grodzovsky wrote: Restrict job resubmission to the suspend case only, since schedulers are not initialised yet on probe. Signed-off-by: Andrey Grodzovsky ---   drivers/gpu

RE: [PATCH] drm/amdgpu: save error count in RAS poison handler

2021-12-20 Thread Zhou1, Tao
[AMD Official Use Only] > -----Original Message----- > From: Yang, Stanley > Sent: Tuesday, December 21, 2021 2:05 PM > To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org; Zhang, > Hawking ; Chai, Thomas > Subject: Re: [PATCH] drm/amdgpu: save error count in RAS poison handler > > [AMD Official U

Re: [PATCH V5 00/16] Unified entry point for other blocks to interact with power

2021-12-20 Thread Lazar, Lijo
On 12/13/2021 9:22 AM, Evan Quan wrote: There are several problems with current power implementations: 1. Too many internal details are exposed to other blocks. Thus to interact with power, they need to know which power framework is used(powerplay vs swsmu) or even whether some API is

Re: [PATCH V5 13/16] drm/amd/pm: relocate the power related headers

2021-12-20 Thread Lazar, Lijo
On 12/13/2021 9:22 AM, Evan Quan wrote: Instead of centralizing all headers in the same folder, separate them into different folders and place them alongside the source files that actually need them. Signed-off-by: Evan Quan Change-Id: Id74cb4c7006327ca7ecd22daf17321e417c4aa71 -- v1->v2:

Re: [PATCH V5 05/16] drm/amd/pm: do not expose those APIs used internally only in si_dpm.c

2021-12-20 Thread Lazar, Lijo
On 12/13/2021 9:22 AM, Evan Quan wrote: Move them to si_dpm.c instead. Signed-off-by: Evan Quan Change-Id: I288205cfd7c6ba09cfb22626ff70360d61ff0c67 -- v1->v2: - rename the API with "si_" prefix(Alex) v2->v3: - rename other data structures used only in si_dpm.c(Lijo) v3->v4: - renam

Re: [PATCH] drm/amdgpu: save error count in RAS poison handler

2021-12-20 Thread Yang, Stanley
[AMD Official Use Only] > +void amdgpu_umc_ras_fini(struct amdgpu_device *adev) { > + if (amdgpu_ras_is_supported(adev, AMDGPU_RAS_BLOCK__UMC) > && > + adev->umc.ras_if) { > + struct ras_common_if *ras_if = adev->umc.ras_if; > + struct ras_ih_if ih_i

[PATCH AUTOSEL 5.10 14/19] drm/amdgpu: correct the wrong cached state for GMC on PICASSO

2021-12-20 Thread Sasha Levin
From: Evan Quan [ Upstream commit 17c65d6fca844ee72a651944d8ce721e9040bf70 ] Pair the operations done in GMC ->hw_init and ->hw_fini. That can help to maintain the correct cached state for GMC and avoid unintentional gate operation dropping due to a wrong cached state. BugLink: https://gitlab.freedeskto

[PATCH AUTOSEL 5.10 13/19] drm/amd/display: Reset DMCUB before HW init

2021-12-20 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit 791255ca9fbe38042cfd55df5deb116dc11fef18 ] [Why] If the firmware wasn't reset by PSP or HW and is currently running then the firmware will hang or perform undefined behavior when we modify its firmware state underneath it. [How] Reset DMCUB before se

[PATCH AUTOSEL 5.15 19/29] drm/amdgpu: correct the wrong cached state for GMC on PICASSO

2021-12-20 Thread Sasha Levin
From: Evan Quan [ Upstream commit 17c65d6fca844ee72a651944d8ce721e9040bf70 ] Pair the operations done in GMC ->hw_init and ->hw_fini. That can help to maintain the correct cached state for GMC and avoid unintentional gate operation dropping due to a wrong cached state. BugLink: https://gitlab.freedeskto

[PATCH AUTOSEL 5.15 18/29] drm/amd/display: Reset DMCUB before HW init

2021-12-20 Thread Sasha Levin
From: Nicholas Kazlauskas [ Upstream commit 791255ca9fbe38042cfd55df5deb116dc11fef18 ] [Why] If the firmware wasn't reset by PSP or HW and is currently running then the firmware will hang or perform undefined behavior when we modify its firmware state underneath it. [How] Reset DMCUB before se

RE: [PATCH] drm/amdgpu: drop redundant semicolon

2021-12-20 Thread Quan, Evan
[AMD Official Use Only] Reviewed-by: Evan Quan > -----Original Message----- > From: amd-gfx On Behalf Of > Guchun Chen > Sent: Monday, December 20, 2021 10:34 PM > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Koenig, Christian > ; Pan, Xinhui > Cc: Chen, Guchun > Subject: [PATCH]

Re: [PATCH v2] drm/amd/display: move calcs folder into DML

2021-12-20 Thread isabbasso
On 2021-12-20 20:20, Isabella Basso wrote: > The calcs folder has FPU code in it, which should be isolated inside the > DML folder as per https://patchwork.freedesktop.org/series/93042/. > > This commit aims solely to correct the location of such FPU > code and does not refactor any funct

[PATCH v2] drm/amd/display: move calcs folder into DML

2021-12-20 Thread Isabella Basso
The calcs folder has FPU code in it, which should be isolated inside the DML folder as per https://patchwork.freedesktop.org/series/93042/. This commit aims solely to correct the location of such FPU code and does not refactor any functions. Signed-off-by: Isabella Basso --- drivers/gp

Re: [RFC 4/6] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2021-12-20 Thread Andrey Grodzovsky
On 2021-12-20 2:20 a.m., Christian König wrote: On 17.12.21 at 23:27, Andrey Grodzovsky wrote: Use the reset domain wq also for non-TDR GPU recovery triggers such as sysfs and RAS. We must serialize all possible GPU recoveries to guarantee no concurrency there. For TDR call the original recovery fu

Re: [RFC 2/6] drm/amdgpu: Move scheduler init to after XGMI is ready

2021-12-20 Thread Andrey Grodzovsky
On 2021-12-20 2:16 a.m., Christian König wrote: On 17.12.21 at 23:27, Andrey Grodzovsky wrote: Before we initialize schedulers we must know which reset domain we are in - for a single device there is a single domain per device and so a single wq per device. For XGMI the reset domain spans the

Re: Expecting to revert commit 55285e21f045 "fbdev/efifb: Release PCI device ..."

2021-12-20 Thread Daniel Vetter
Adding more amdgpu folks. Smells like this runtime pm leak is papering over an issue in the amdgpu driver. Other bug reports are also only about amdgpu.ko, it seems. Would an unconditional pm_runtime_get_sync() in amdgpu_pci_probe() also work? If you do it before the drm_aperture_remove_conflicting

Re: [RFC 3/6] drm/amdgpu: Fix crash on modprobe

2021-12-20 Thread Andrey Grodzovsky
On 2021-12-20 2:17 a.m., Christian König wrote: On 17.12.21 at 23:27, Andrey Grodzovsky wrote: Restrict job resubmission to the suspend case only, since schedulers are not initialised yet on probe. Signed-off-by: Andrey Grodzovsky ---   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +-   1 file chan

Re: [RFC 0/6] Define and use reset domain for GPU recovery in amdgpu

2021-12-20 Thread Andrey Grodzovsky
On 2021-12-20 12:06 p.m., Liu, Shaoyun wrote: [AMD Official Use Only] Hi Andrey, I actually have some concerns about this change. 1. On an SRIOV configuration, the reset notification comes from the host, and the driver already triggers a work queue to handle the reset (check xgpu_*_mailbox_flr_work),

RE: [PATCH] amdgpu/pm: Modify sysfs pp_dpm_sclk to have only read premission in ONEVF mode

2021-12-20 Thread Russell, Kent
[AMD Official Use Only] > -----Original Message----- > From: amd-gfx On Behalf Of Marina > Nikolic > Sent: Monday, December 20, 2021 11:09 AM > To: amd-gfx@lists.freedesktop.org > Cc: Mitrovic, Milan ; Nikolic, Marina > ; Kitchen, Greg > Subject: [PATCH] amdgpu/pm: Modify sysfs pp_dpm_sclk to h

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-20 Thread Bhardwaj, Rajneesh
On 12/20/2021 4:29 AM, Daniel Vetter wrote: On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote: On 09.12.21 at 19:28, Felix Kuehling wrote: On 2021-12-09 at 10:30 a.m., Christian König wrote: That still won't work. But I think we could do this change for the amdgpu mmap callb

RE: [RFC 0/6] Define and use reset domain for GPU recovery in amdgpu

2021-12-20 Thread Liu, Shaoyun
[AMD Official Use Only] Hi Andrey, I actually have some concerns about this change. 1. On an SRIOV configuration, the reset notification comes from the host, and the driver already triggers a work queue to handle the reset (check xgpu_*_mailbox_flr_work), is it a good idea to trigger another work queue

[PATCH] amdgpu/pm: Modify sysfs pp_dpm_sclk to have only read premission in ONEVF mode

2021-12-20 Thread Marina Nikolic
== Description == For security reasons, setting through sysfs should only be allowed in passthrough mode. Options that are not mapped as SMU messages do not have any mechanism to distinguish between the passthrough, onevf and multivf use cases. A unified approach is needed. == Changes == This patch int

[PATCH] drm/amdgpu: drop redundant semicolon

2021-12-20 Thread Guchun Chen
A minor typo. Signed-off-by: Guchun Chen --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index b13db855cc9a..580a5b387122 100644 --

RE: [PATCH 4/4] drm/amdgpu: Access the FRU on Aldebaran

2021-12-20 Thread Russell, Kent
That's fine, I can do that change and submit the series per Alex's RB. Kent > -----Original Message----- > From: Chen, Guchun > Sent: Friday, December 17, 2021 9:49 PM > To: Russell, Kent ; amd-gfx@lists.freedesktop.org > Cc: Russell, Kent > Subject: RE: [PATCH 4/4] drm/amdgpu: Access the FRU

RE: [PATCH 1/4] drm/amdgpu: Increase potential product_name to 64 characters

2021-12-20 Thread Russell, Kent
[AMD Official Use Only] Will do. Thanks! Kent > -----Original Message----- > From: Christian König > Sent: Saturday, December 18, 2021 9:36 AM > To: Russell, Kent ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/4] drm/amdgpu: Increase potential product_name to 64 > characters > > > >

RE: [PATCH] drm/amdkfd: correct sdma queue number in kfd device init (v2)

2021-12-20 Thread Kim, Jonathan
> -----Original Message----- > From: Sider, Graham > Sent: December 20, 2021 1:19 AM > To: Kim, Jonathan ; Chen, Guchun > ; amd-gfx@lists.freedesktop.org; Deucher, > Alexander ; Kuehling, Felix > > Subject: RE: [PATCH] drm/amdkfd: correct sdma queue number in kfd > device init (v2) > > [Publi

Re: [RFC 0/6] Define and use reset domain for GPU recovery in amdgpu

2021-12-20 Thread Daniel Vetter
On Mon, Dec 20, 2021 at 08:25:05AM +0100, Christian König wrote: > On 17.12.21 at 23:27, Andrey Grodzovsky wrote: > > This patchset is based on earlier work by Boris[1] that allowed to have an > > ordered workqueue at the driver level that will be used by the different > > schedulers to queue thei

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2021-12-20 Thread Daniel Vetter
On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote: > On 09.12.21 at 19:28, Felix Kuehling wrote: > > On 2021-12-09 at 10:30 a.m., Christian König wrote: > > > That still won't work. > > > > > > But I think we could do this change for the amdgpu mmap callback only. > > If graphics u

[PATCH] drm/amd/display: Fix the uninitialized variable in enable_stream_features()

2021-12-20 Thread Yizhuo Zhai
In function enable_stream_features(), the variable "old_downspread.raw" could be uninitialized if core_link_read_dpcd() fails. However, it is used in the later if statement, and further, core_link_write_dpcd() may write a random value, which is potentially unsafe. Fixes: 6016cd9dba0f ("drm/amd/displ
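
The hazard described above (a read that can fail, followed by a compare and a write based on whatever was in the buffer) can be illustrated with a small standalone sketch. This is not the patch from this thread and does not use the real display-core API: every `sketch_*` name, the function signatures, and the `SKETCH_IGNORE_MSA_BIT` mask are hypothetical stand-ins chosen for illustration. It shows one common way to close such a gap: give the variable a defined value up front and skip the write when the read fails.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit mask, assumed for this sketch only. */
#define SKETCH_IGNORE_MSA_BIT 0x80u

/* Records what the hypothetical write stub was asked to write. */
static int sketch_write_count;
static uint8_t sketch_last_written;

/* Hypothetical stand-in for a register read that may fail.
 * On failure the output buffer is left untouched, which is
 * exactly why the caller must not rely on its contents. */
static bool sketch_read_dpcd(bool link_ok, uint8_t hw_value, uint8_t *out)
{
    if (!link_ok)
        return false;
    *out = hw_value;
    return true;
}

/* Hypothetical stand-in for the register write. */
static void sketch_write_dpcd(uint8_t value)
{
    sketch_last_written = value;
    sketch_write_count++;
}

/* Returns true only if a write was actually issued. */
static bool sketch_enable_stream_features(bool link_ok, uint8_t hw_value,
                                          bool ignore_msa)
{
    uint8_t old_raw = 0; /* defined value even if the read fails */
    uint8_t new_raw;

    /* Bail out instead of comparing against stale stack contents. */
    if (!sketch_read_dpcd(link_ok, hw_value, &old_raw))
        return false;

    new_raw = ignore_msa
        ? (uint8_t)(old_raw | SKETCH_IGNORE_MSA_BIT)
        : (uint8_t)(old_raw & (uint8_t)~SKETCH_IGNORE_MSA_BIT);

    if (new_raw == old_raw)
        return false; /* nothing to change */

    sketch_write_dpcd(new_raw);
    return true;
}
```

Either measure alone (zero-initializing, or checking the read's return value) removes the undefined read; doing both also avoids writing a value derived from a default rather than from the hardware.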

Re: Re: Re: Re: Re: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8

2021-12-20 Thread 周宗敏
Dear Alex: I've never tried to get a VBIOS before, so can you tell me how to get a VBIOS image copy for you? I tried to Google it and only found that it might be obtained in the following way:
echo 1 > /sys/devices/pci:00/:00:02.0/rom
cat /sys/devices/pci:00/:00:02.0/rom > vbios.dump

Re: Potential Bug in drm/amd/display/dc_link

2021-12-20 Thread Yizhuo Zhai
Hi Harry: Thanks for your feedback, I will submit the patch for variable "old_downspread" in the function enable_stream_features(). And I double checked the code in the mainline and found that the buggy function wait_for_training_aux_rd_interval() has been removed, and the corresponding bug has bee

[PATCH] drm/amdgpu: save error count in RAS poison handler

2021-12-20 Thread Tao Zhou
Otherwise the RAS error count couldn't be queried from sysfs. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c| 170 - drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h| 3 +- 3 files changed, 99 insertio