from:"Xu, Feifei"

RE: [PATCH] drm/amd/amdgpu: move drain_workqueue before shutdown is set

2024-08-25 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Victor Zhao
Sent: Monday, August 26, 2024 11:52 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhao, Victor 
Subject: [PATCH] drm/amd/amdgpu: move drain_workqueue before shutdown is set

[background] When the amdgpu driver is unloaded right after running a
workload, drain_workqueue triggers "Fence fallback timer expired on ring
sdma0.0". Under SR-IOV, this causes a full access timeout followed by a
reset.

Move drain_workqueue before shutdown is set, so that IH processing is still
allowed, and before entering full access under SR-IOV, so that the drain does
not count against the full access time.

Signed-off-by: Victor Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index da06705f0026..f06e1f408f20 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4531,6 +4531,9 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 {
 	dev_info(adev->dev, "amdgpu: finishing device.\n");
 	flush_delayed_work(&adev->delayed_init_work);
+
+	if (adev->mman.initialized)
+		drain_workqueue(adev->mman.bdev.wq);
 	adev->shutdown = true;
 
 	/* make sure IB test finished before entering exclusive mode
@@ -4551,9 +4554,6 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 	}
 	amdgpu_fence_driver_hw_fini(adev);
 
-	if (adev->mman.initialized)
-		drain_workqueue(adev->mman.bdev.wq);
-
 	if (adev->pm.sysfs_initialized)
 		amdgpu_pm_sysfs_fini(adev);
 	if (adev->ucode_sysfs_en)
--
2.34.1
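
The ordering argument in the commit message can be illustrated with a small
user-space toy model. This is only a sketch, not driver code: every name below
(toy_dev, toy_drain_workqueue, and so on) is hypothetical, and it assumes, as
the commit message implies, that fences stop being signalled from the
interrupt path once the shutdown flag is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are amdgpu APIs. */
struct toy_dev {
	bool shutdown;     /* once set, interrupt processing is assumed to stop */
	int  pending_work; /* queued work items, each waiting on one fence */
	int  timeouts;     /* fences that completed only via a fallback timer */
};

/* Drain the queue; each item waits for a fence signalled by an interrupt. */
static void toy_drain_workqueue(struct toy_dev *d)
{
	assert(d->pending_work >= 0);
	while (d->pending_work > 0) {
		if (d->shutdown)
			d->timeouts++; /* no interrupt arrives, fallback timer fires */
		d->pending_work--;
	}
}

/* Old order: set shutdown first, then drain. */
static int toy_fini_old(struct toy_dev *d)
{
	d->shutdown = true;
	toy_drain_workqueue(d);
	return d->timeouts;
}

/* New order: drain while interrupts are still processed, then set shutdown. */
static int toy_fini_new(struct toy_dev *d)
{
	toy_drain_workqueue(d);
	d->shutdown = true;
	return d->timeouts;
}
```

In this model the old order pays one timeout per queued item, while the new
order pays none; the real driver behaviour depends on details not shown in the
patch.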

RE: [PATCH 02/10] drm/amdgpu: Use init level for pending_reset flag

2024-09-03 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Comment inline.

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Monday, September 2, 2024 3:34 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian 
Subject: [PATCH 02/10] drm/amdgpu: Use init level for pending_reset flag

Drop pending_reset flag in gmc block. Instead use init level to determine which 
type of init is preferred - in this case MINIMAL.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 33 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c  |  6 ++--
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c|  3 +-
 6 files changed, 13 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4fb09c4fbf22..db5046e8b10d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1691,7 +1691,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
}

/* Don't post if we need to reset whole hive on init */
-   if (adev->gmc.xgmi.pending_reset)
+   if (adev->init_lvl->level == AMDGPU_INIT_LEVEL_MINIMAL)
return false;

if (adev->has_hw_reset) {
@@ -2985,7 +2985,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
amdgpu_ttm_set_buffer_funcs_status(adev, true);

/* Don't init kfd if whole hive need to be reset during init */
-   if (!adev->gmc.xgmi.pending_reset) {
+   if (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL) {
kgd2kfd_init_zone_device(adev);
amdgpu_amdkfd_device_init(adev);
}
@@ -3499,14 +3499,9 @@ static int amdgpu_device_ip_suspend_phase2(struct 
amdgpu_device *adev)
}

/* skip unnecessary suspend if we do not initialize them yet */
-   if (adev->gmc.xgmi.pending_reset &&
-   !(adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC 
||
- adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC 
||
- adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_COMMON ||
- adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_IH)) {
-   adev->ip_blocks[i].status.hw = false;
+   if (!amdgpu_ip_member_of_hwini(
+   adev, adev->ip_blocks[i].version->type))
continue;
-   }

[Feifei]: AMDGPU_INIT_LEVEL_MINIMAL indicates the minimal set of blocks that
need hw_init when the SMC has to handle the mode1 reset. On newer ASICs it is
the SMC doing the reset, but on some older ones it is MP0.
   Would a name like AMDGPU_INIT_LEVEL_MINIMAL_SMC be more readable and
avoid confusion?


/* skip suspend of gfx/mes and psp for S0ix
 * gfx is in gfxoff state, so on resume it will exit gfxoff 
just @@ -4320,20 +4315,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (!amdgpu_sriov_vf(adev) && amdgpu_asic_need_reset_on_init(adev)) {
if (adev->gmc.xgmi.num_physical_nodes) {
dev_info(adev->dev, "Pending hive reset.\n");
-   adev->gmc.xgmi.pending_reset = true;
-   /* Only need to init necessary block for SMU to handle 
the reset */
-   for (i = 0; i < adev->num_ip_blocks; i++) {
-   if (!adev->ip_blocks[i].status.valid)
-   continue;
-   if (!(adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_GMC ||
- adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_COMMON ||
- adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_IH ||
- adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_SMC)) {
-   DRM_DEBUG("IP %s disabled for 
hw_init.\n",
-   
adev->ip_blocks[i].version->funcs->name);
-   adev->ip_blocks[i].status.hw = true;
-   }
-   }
+   amdgpu_set_init_level(adev, AMDGPU_INIT_LEVEL_MINIMAL);
} else if (amdgpu_ip_version(adev, MP1_HWIP, 0) == 
IP_VERSION(13, 0, 10) &&
   !amdgpu_device_has_display_hardware(adev)) {
r = psp_gpu_reset(adev);
@@ -4441,7 +4423,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* enable clockgating, etc. after ib tests
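
As a sanity check on the refactor discussed above, the following stand-alone
sketch compares the old open-coded skip condition in the suspend path with a
mask membership test. It is an illustration under assumptions: the enum is a
made-up subset of IP block types with arbitrary numeric values, all toy_*
names are hypothetical, and it only models the "skip this block" decision (the
removed code also cleared status.hw, which is not modelled here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative subset of IP block types; numeric values are assumptions. */
enum toy_ip_block_type {
	TOY_IP_COMMON, TOY_IP_GMC, TOY_IP_IH, TOY_IP_SMC,
	TOY_IP_PSP, TOY_IP_GFX, TOY_IP_SDMA, TOY_IP_NUM
};

#define TOY_BIT(n) (1U << (n))

/* Same four blocks as the MINIMAL level in patch 01/10. */
static const uint32_t toy_minimal_mask =
	TOY_BIT(TOY_IP_GMC) | TOY_BIT(TOY_IP_SMC) |
	TOY_BIT(TOY_IP_COMMON) | TOY_BIT(TOY_IP_IH);

/* Old form: a flag plus an open-coded list of block types. */
static bool toy_skip_old(bool pending_reset, enum toy_ip_block_type t)
{
	return pending_reset &&
	       !(t == TOY_IP_GMC || t == TOY_IP_SMC ||
		 t == TOY_IP_COMMON || t == TOY_IP_IH);
}

/* New form: membership test against the active level's mask. */
static bool toy_skip_new(uint32_t hwini_mask, enum toy_ip_block_type t)
{
	return (hwini_mask & TOY_BIT(t)) == 0;
}

/* Count block types for which the two forms disagree. */
static int toy_count_mismatches(void)
{
	const uint32_t all = TOY_BIT(TOY_IP_NUM) - 1;
	int mismatches = 0;

	for (int t = 0; t < TOY_IP_NUM; t++) {
		enum toy_ip_block_type type = (enum toy_ip_block_type)t;

		if (toy_skip_old(true, type) != toy_skip_new(toy_minimal_mask, type))
			mismatches++;
		if (toy_skip_old(false, type) != toy_skip_new(all, type))
			mismatches++;
	}
	return mismatches;
}
```

With pending_reset mapped to the minimal mask and the non-pending case mapped
to an all-ones mask, the two predicates agree for every block type in the
model.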

RE: [PATCH 02/10] drm/amdgpu: Use init level for pending_reset flag

2024-09-03 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

>> It is kept this way now as the immediate purpose is to support 'minimal'
>>init required for XGMI-reset-on-init scenario for limited SOCs. In that 
>>sense, this could be renamed that way also.

Maybe adding a comment on MINIMAL along the lines of the explanation above
(the minimal IP blocks required in the xgmi-reset-on-init scenario) in
[PATCH 01/10] drm/amdgpu: Add init levels would make it more readable.

Thanks,
Feifei

-Original Message-
From: Lazar, Lijo 
Sent: Wednesday, September 4, 2024 11:24 AM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian 
Subject: Re: [PATCH 02/10] drm/amdgpu: Use init level for pending_reset flag



On 9/4/2024 7:40 AM, Xu, Feifei wrote:
> [...]
> [Feifei]: AMDGPU_INIT_LEVEL_MINIMAL indicates the minimal set of blocks
> that need hw_init when the SMC has to handle the mode1 reset. On newer
> ASICs it is the SMC doing the reset, but on some older ones it is MP0.
>    Would a name like AMDGPU_INIT_LEVEL_MINIMAL_SMC be more readable and
> avoid confusion?

The original intention for levels is as follows:

Define a single 'minimal' init level required for the SOC. Further levels
such as suspend, s0i3, emulation/simulation etc. may be introduced later to
define the level of initialization required for those scenarios. Basically,
the idea was to make it SOC specific with a callback.

It is kept this way for now because the immediate purpose is to support the
'minimal' init required for the XGMI-reset-on-init scenario on a limited set
of SOCs. In that sense, it could also be renamed that way.

>
>
> /* skip suspend of gfx/mes and psp f

RE: [PATCH 01/10] drm/amdgpu: Add init levels

2024-09-04 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Comments inline.

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Monday, September 2, 2024 3:34 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian 
Subject: [PATCH 01/10] drm/amdgpu: Add init levels

Add init levels to define the level to which device needs to be initialized.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 14 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 54 ++
 2 files changed, 68 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6e6580ab7e04..fefdace22894 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -820,6 +820,16 @@ struct amdgpu_mqd {
struct amdgpu_mqd_prop *p);
 };

+enum amdgpu_init_lvl_id {
+   AMDGPU_INIT_LEVEL_DEFAULT,
+   AMDGPU_INIT_LEVEL_MINIMAL,
+};
+
+struct amdgpu_init_level {
+   enum amdgpu_init_lvl_id level;
+   uint32_t hwini_ip_block_mask;
+};
+
 #define AMDGPU_RESET_MAGIC_NUM 64
 #define AMDGPU_MAX_DF_PERFMONS 4
 struct amdgpu_reset_domain;
@@ -1169,6 +1179,8 @@ struct amdgpu_device {
boolenforce_isolation[MAX_XCP];
/* Added this mutex for cleaner shader isolation between GFX and 
compute processes */
struct mutexenforce_isolation_mutex;
+
+   struct amdgpu_init_level *init_lvl;
 };

 static inline uint32_t amdgpu_ip_version(const struct amdgpu_device *adev, @@ 
-1623,4 +1635,6 @@ extern const struct attribute_group 
amdgpu_vram_mgr_attr_group;  extern const struct attribute_group 
amdgpu_gtt_mgr_attr_group;  extern const struct attribute_group 
amdgpu_flash_attr_group;

+void amdgpu_set_init_level(struct amdgpu_device *adev,
+  enum amdgpu_init_lvl_id lvl);
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 61a189e30bcd..4fb09c4fbf22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -144,6 +144,42 @@ const char *amdgpu_asic_name[] = {
"LAST",
 };

+#define AMDGPU_IP_BLK_MASK_ALL GENMASK(AMDGPU_MAX_IP_NUM - 1, 0)
+
+struct amdgpu_init_level amdgpu_init_default = {
+   .level = AMDGPU_INIT_LEVEL_DEFAULT,
+   .hwini_ip_block_mask = AMDGPU_IP_BLK_MASK_ALL, };
+
+struct amdgpu_init_level amdgpu_init_minimal = {
+   .level = AMDGPU_INIT_LEVEL_MINIMAL,
+   .hwini_ip_block_mask =
+   BIT(AMD_IP_BLOCK_TYPE_GMC) | BIT(AMD_IP_BLOCK_TYPE_SMC) |
+   BIT(AMD_IP_BLOCK_TYPE_COMMON) | BIT(AMD_IP_BLOCK_TYPE_IH) };
+
+static inline bool amdgpu_ip_member_of_hwini(struct amdgpu_device *adev,
+enum amd_ip_block_type block) {
+   return (adev->init_lvl->hwini_ip_block_mask & (1U << block)) != 0; }
+
+void amdgpu_set_init_level(struct amdgpu_device *adev,
+  enum amdgpu_init_lvl_id lvl)
+{
+   switch (lvl) {
+   case AMDGPU_INIT_LEVEL_DEFAULT:
+   adev->init_lvl = &amdgpu_init_default;
+   break;
+   case AMDGPU_INIT_LEVEL_MINIMAL:
+   adev->init_lvl = &amdgpu_init_minimal;
+   break;
+   default:
+   adev->init_lvl = &amdgpu_init_default;
+   break;
+   }
+}
+
 static inline void amdgpu_device_stop_pending_resets(struct amdgpu_device 
*adev);

 /**
@@ -2633,6 +2669,9 @@ static int amdgpu_device_ip_hw_init_phase1(struct 
amdgpu_device *adev)
continue;
if (adev->ip_blocks[i].status.hw)
continue;
+   if (!amdgpu_ip_member_of_hwini(
+   adev, adev->ip_blocks[i].version->type))
+   continue;

[Feifei]: If xgmi-reset-on-init (mode1) is not applicable to an SR-IOV VF,
the check above in amdgpu_device_ip_hw_init_phase1() is redundant.
  On bare metal (BM), amdgpu_device_ip_hw_init_phase1() only runs hw_init
for two IPs, AMD_IP_BLOCK_TYPE_COMMON and AMD_IP_BLOCK_TYPE_IH; on a VF it
adds one more, PSP. On BM, both COMMON and IH are in the minimal set of
supported IPs.
  It is harmless to keep the check here, though.

if (adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_COMMON ||
(amdgpu_sriov_vf(adev) && (adev->ip_blocks[i].version->type 
== AMD_IP_BLOCK_TYPE_PSP)) ||
adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH) { 
@@ -2658,6 +2697,9 @@ static int amdgpu_device_ip_hw_init_phase2(struct 
amdgpu_device *adev)
continue;
if (adev->ip_blocks[i].status.hw)
continue;
+   if (!amdgpu_ip_member_of_hwini(
+   adev, adev->ip_blocks[i].version->type))
+   continue;
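
The init-level mechanism in this patch can be tried out in user space with a
reduced model. The sketch below is illustrative only: the block list is a
hypothetical subset, the toy_* names are not kernel symbols, and the hw_init
loop is collapsed into a single pass that just counts the blocks it brings up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of IP blocks; order and values are assumptions. */
enum toy_block { TOY_COMMON, TOY_GMC, TOY_IH, TOY_SMC, TOY_PSP, TOY_GFX, TOY_MAX };
enum toy_lvl_id { TOY_LEVEL_DEFAULT, TOY_LEVEL_MINIMAL };

struct toy_init_level {
	enum toy_lvl_id level;
	uint32_t hwini_mask; /* one bit per block that may run hw_init */
};

static const struct toy_init_level toy_default = {
	TOY_LEVEL_DEFAULT, (1U << TOY_MAX) - 1
};

static const struct toy_init_level toy_minimal = {
	TOY_LEVEL_MINIMAL,
	(1U << TOY_GMC) | (1U << TOY_SMC) | (1U << TOY_COMMON) | (1U << TOY_IH)
};

struct toy_dev {
	const struct toy_init_level *init_lvl;
	bool hw[TOY_MAX]; /* true once a block has been initialized */
};

/* Select a level; unknown ids fall back to the default level. */
static void toy_set_init_level(struct toy_dev *d, enum toy_lvl_id lvl)
{
	d->init_lvl = (lvl == TOY_LEVEL_MINIMAL) ? &toy_minimal : &toy_default;
}

/* Initialize every not-yet-initialized block allowed by the current level. */
static int toy_hw_init(struct toy_dev *d)
{
	int brought_up = 0;

	assert(d->init_lvl != 0);
	for (int i = 0; i < TOY_MAX; i++) {
		if (d->hw[i])
			continue;
		if (!(d->init_lvl->hwini_mask & (1U << i)))
			continue;
		d->hw[i] = true;
		brought_up++;
	}
	return brought_up;
}
```

Running the minimal level first and the default level afterwards brings up
the four minimal blocks and then only the remaining ones, which is the kind of
staged bring-up the series appears to be aiming for.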

RE: [PATCH 00/10] Support XGMI reset on init

2024-09-05 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Patch3~10:

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Monday, September 2, 2024 3:34 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian 
Subject: [PATCH 00/10] Support XGMI reset on init

There are cases where a device needs to be reset before it is fully
initialized. One example is a driver reinstallation with a different version
of PSP TOS. In such a case, if the device supports a reset in which PSP TOS is
unloaded, the driver needs to reset the device first and then load the new
firmware components.

For devices in an XGMI hive, a reset needs to be sent to all devices in the
hive. Thus the driver should first discover the devices that belong to a hive
with PSP support.

There is an existing delayed reset handler; however, it has the following
limitations:
1) It doesn't discover devices in the hive; instead it tries to do an XGMI
reset for all devices registered to the mgpu struct. The mgpu struct may hold
devices other than the ones that belong to a hive. Also, it doesn't work if
there is more than one hive.
2) It doesn't take a reset lock, and since this is a delayed reset, that could
result in unwanted hardware accesses during a reset.
3) It doesn't initialize RAS properly (left as TODO).

This series overcomes the above limitations. Instead of marking a pending
reset, it defines init levels that specify how far a device is initialized.
In case of a pending reset, only specific hardware blocks are initialized.

Further work (not done in this series) may add fine-grained control over init
levels, for example skipping features such as DPM enablement, or skipping the
loading of specific firmware that is not required during a minimal init where
the device is about to be reset.

The series adds an API interface to check if a PSP TOS reload is required.

Lijo Lazar (10):
  drm/amdgpu: Add init levels
  drm/amdgpu: Use init level for pending_reset flag
  drm/amdgpu: Separate reinitialization after reset
  drm/amdgpu: Add reset on init handler for XGMI
  drm/amdgpu: Add helper to initialize badpage info
  drm/amdgpu: Refactor XGMI reset on init handling
  drm/amdgpu: Drop delayed reset work handler
  drm/amdgpu: Support reset-on-init on select SOCs
  drm/amdgpu: Add interface for TOS reload cases
  drm/amdgpu: Add PSP reload case to reset-on-init

 drivers/gpu/drm/amd/amdgpu/aldebaran.c|   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  21 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 245 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  81 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   |  13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c   |  62 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 148 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h |   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c  |  72 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h  |   2 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  14 +-
 drivers/gpu/drm/amd/amdgpu/psp_v13_0.c|  25 ++
 drivers/gpu/drm/amd/amdgpu/soc15.c|   7 +
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c|   3 +-
 17 files changed, 492 insertions(+), 214 deletions(-)

--
2.25.1
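
The first limitation listed in the cover letter (resetting everything in the
mgpu struct rather than only the members of one hive) can be made concrete
with a short sketch. It is a simplified illustration, not the series' code:
toy_gpu, its hive_id field, and the convention that hive_id 0 means "not in a
hive" are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record; hive_id == 0 means "not part of a hive". */
struct toy_gpu {
	uint64_t hive_id;
};

/*
 * Collect the indices of all devices that must be reset together with
 * devs[self]. The caller provides `out` with room for n entries.
 * Returns the number of indices written.
 */
static size_t toy_hive_members(const struct toy_gpu *devs, size_t n,
			       size_t self, size_t *out)
{
	size_t count = 0;

	assert(self < n);
	if (devs[self].hive_id == 0) {
		out[0] = self; /* standalone device: reset only itself */
		return 1;
	}
	for (size_t i = 0; i < n; i++) {
		if (devs[i].hive_id == devs[self].hive_id)
			out[count++] = i;
	}
	return count;
}
```

Grouping by hive id keeps a reset from touching devices in another hive or
devices that are not in a hive at all, which is the behaviour the cover letter
says the old delayed handler lacked.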

RE: [PATCH 2/2] drm/amdgpu/umsch: reinitialize write pointer in hw init

2024-03-25 Thread Xu, Feifei

[AMD Official Use Only - General]

Series is Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lang Yu
Sent: Monday, March 25, 2024 1:37 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian 
; Gopalakrishnan, Veerabadhran (Veera) 
; Yu, Lang 
Subject: [PATCH 2/2] drm/amdgpu/umsch: reinitialize write pointer in hw init

Otherwise the old one will be used during GPU reset.
That's not expected.

Signed-off-by: Lang Yu 
---
 drivers/gpu/drm/amd/amdgpu/umsch_mm_v4_0.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/umsch_mm_v4_0.c b/drivers/gpu/drm/amd/amdgpu/umsch_mm_v4_0.c
index 84368cf1e175..bd57896ab85d 100644
--- a/drivers/gpu/drm/amd/amdgpu/umsch_mm_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/umsch_mm_v4_0.c
@@ -225,6 +225,8 @@ static int umsch_mm_v4_0_ring_start(struct amdgpu_umsch_mm *umsch)
 
 	WREG32_SOC15(VCN, 0, regVCN_UMSCH_RB_SIZE, ring->ring_size);
 
+	ring->wptr = 0;
+
 	data = RREG32_SOC15(VCN, 0, regVCN_RB_ENABLE);
 	data &= ~(VCN_RB_ENABLE__AUDIO_RB_EN_MASK);
 	WREG32_SOC15(VCN, 0, regVCN_RB_ENABLE, data);
--
2.25.1
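
Why a stale write pointer matters after a reset can be shown with a tiny ring
model. This is a sketch under assumptions: the toy_ring structure and helpers
are hypothetical, and it assumes the hardware-side pointer returns to zero
across a GPU reset while the driver's cached copy survives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy ring: `wptr` is the driver's cached copy, `hw_wptr` the engine's view. */
struct toy_ring {
	uint32_t wptr;
	uint32_t hw_wptr;
	uint32_t size_dw;
};

/* Submit ndw dwords: advance the cached pointer and publish it to the engine. */
static void toy_ring_commit(struct toy_ring *r, uint32_t ndw)
{
	assert(r->size_dw != 0);
	r->wptr = (r->wptr + ndw) % r->size_dw;
	r->hw_wptr = r->wptr;
}

/* Assumed reset behaviour: only the engine-side pointer is cleared. */
static void toy_gpu_reset(struct toy_ring *r)
{
	r->hw_wptr = 0;
}

/* hw init path; `reinit_wptr` selects the behaviour with or without the fix. */
static void toy_ring_start(struct toy_ring *r, bool reinit_wptr)
{
	if (reinit_wptr)
		r->wptr = 0;
}

static bool toy_ring_in_sync(const struct toy_ring *r)
{
	return r->wptr == r->hw_wptr;
}
```

Without the reinitialization the cached pointer still holds the pre-reset
value, so the next submission would start from the wrong offset in this model.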

RE: [PATCH] Documentation: add a page on amdgpu debugging

2024-03-26 Thread Xu, Feifei

[AMD Official Use Only - General]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Saturday, March 16, 2024 12:45 AM
To: Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] Documentation: add a page on amdgpu debugging

On Fri, Mar 15, 2024 at 12:07 PM Alex Deucher  wrote:
>
> Covers GPU page fault debugging and adds a reference to umr.
>
> v2: update client ids to include SQC/G
>
> Signed-off-by: Alex Deucher 
> ---
>  Documentation/gpu/amdgpu/debugging.rst | 79 ++
>  Documentation/gpu/amdgpu/index.rst |  1 +
>  2 files changed, 80 insertions(+)
>  create mode 100644 Documentation/gpu/amdgpu/debugging.rst
>
> diff --git a/Documentation/gpu/amdgpu/debugging.rst
> b/Documentation/gpu/amdgpu/debugging.rst
> new file mode 100644
> index ..8b7fdcdf1158
> --- /dev/null
> +++ b/Documentation/gpu/amdgpu/debugging.rst
> @@ -0,0 +1,79 @@
> +===============
> + GPU Debugging
> +===============
> +
> +GPUVM Debugging
> +===============
> +
> +To aid in debugging GPU virtual memory related problems, the driver
> +supports a number of options module parameters:
> +
> +`vm_fault_stop` - If non-0, halt the GPU memory controller on a GPU page 
> fault.
> +
> +`vm_update_mode` - If non-0, use the CPU to update GPU page tables
> +rather than the GPU.
> +
> +
> +Decoding a GPUVM Page Fault
> +===========================
> +
> +If you see a GPU page fault in the kernel log, you can decode it to
> +figure out what is going wrong in your application.  A page fault in
> +your kernel log may look something like this:
> +
> +::
> +
> + [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:3 pasid:32777, for 
> process glxinfo pid 2424 thread glxinfo:cs0 pid 2425)
> +   in page starting at address 0x80010280 from IH client 0x1b
> + (UTCL2)
> + VM_L2_PROTECTION_FAULT_STATUS:0x00301030
> +   Faulty UTCL2 client ID: TCP (0x8)
> +   MORE_FAULTS: 0x0
> +   WALKER_ERROR: 0x0
> +   PERMISSION_FAULTS: 0x3
> +   MAPPING_ERROR: 0x0
> +   RW: 0x0
> +
> +First you have the memory hub, gfxhub and mmhub.  gfxhub is the
> +memory hub used for graphics, compute, and sdma on some chips.  mmhub
> +is the memory hub used for multi-media and sdma on some chips.
> +
> +Next you have the vmid and pasid.  If the vmid is 0, this fault was
> +likely caused by the kernel driver or firmware.  If the vmid is
> +non-0, it is generally a fault in a user application.  The pasid is
> +used to link a vmid to a system process id.  If the process is active
> +when the fault happens, the process information will be printed.
> +
> +The GPU virtual address that caused the fault comes next.
> +
> +The client ID indicates the GPU block that caused the fault.
> +Some common client IDs:
> +
> +- CB/DB: The color/depth backend of the graphics pipe
> +- CPF: Command Processor Frontend
> +- CPC: Command Processor Compute
> +- CPG: Command Processor Graphics
> +- TCP/SQC/SQG: Shaders
> +- SDMA: SDMA engines
> +- VCN: Video encode/decode engines
> +- JPEG: JPEG engines
> +
> +PERMISSION_FAULTS describe what faults were encountered:
> +
> +- bit 0: the PTE was not valid
> +- bit 1: the PTE read bit was not set
> +- bit 2: the PTE write bit was not set
> +- bit 3: the PTE execute bit was not set
> +
> +Finally, RW, indicates whether the access was a read (0) or a write (1).
> +
> +In the example above, a shader (client id = TCP) generated a read (RW
> += 0x0) to an invalid page (PERMISSION_FAULTS = 0x3) at GPU virtual
> +address 0x80010280.  The user can then inspect can then
> +inspect their shader

removed the duplicated text above locally.

Alex

> +code and resource descriptor state to determine what caused the GPU page 
> fault.
> +
> +UMR
> +===
> +
> +`umr `_ is a general
> +purpose GPU debugging and diagnostics tool.  Please see the umr
> +documentation for more information about its capabilities.
> diff --git a/Documentation/gpu/amdgpu/index.rst
> b/Documentation/gpu/amdgpu/index.rst
> index 912e699fd373..847e04924030 100644
> --- a/Documentation/gpu/amdgpu/index.rst
> +++ b/Documentation/gpu/amdgpu/index.rst
> @@ -15,4 +15,5 @@ Next (GCN), Radeon DNA (RDNA), and Compute DNA (CDNA) 
> architectures.
> ras
> thermal
> driver-misc
> +   debugging
> amdgpu-glossary
> --
> 2.44.0
>
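
The PERMISSION_FAULTS and RW decoding described in the quoted documentation
can be turned into a small helper. The sketch below only decodes the already
extracted field values (for example 0x3 and 0x0 from the sample log); it does
not parse VM_L2_PROTECTION_FAULT_STATUS itself, since the register layout is
not given in this thread. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bit meanings as listed in the documentation patch (bits 0..3). */
static const char *const toy_perm_fault_desc[4] = {
	"PTE not valid",
	"PTE read bit not set",
	"PTE write bit not set",
	"PTE execute bit not set",
};

/*
 * Write a comma separated description of the set bits into buf.
 * Returns the number of bits fully described; stops early if buf is too small.
 */
static int toy_decode_permission_faults(unsigned int faults, char *buf, size_t len)
{
	size_t used = 0;
	int described = 0;

	assert(len > 0);
	buf[0] = '\0';
	for (int bit = 0; bit < 4; bit++) {
		if (!(faults & (1U << bit)))
			continue;
		int w = snprintf(buf + used, len - used, "%s%s",
				 described ? ", " : "", toy_perm_fault_desc[bit]);
		if (w < 0 || (size_t)w >= len - used)
			break; /* output truncated, stop here */
		used += (size_t)w;
		described++;
	}
	return described;
}

/* RW field: 0 means the faulting access was a read, 1 a write. */
static const char *toy_decode_rw(unsigned int rw)
{
	return rw ? "write" : "read";
}
```

For the sample fault in the patch, PERMISSION_FAULTS = 0x3 decodes to bits 0
and 1, and RW = 0x0 decodes to a read.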

RE: [PATCH] drm/amdgpu: Fix pci state save during mode-1 reset

2024-06-18 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: Zhang, Hawking 
Sent: Tuesday, June 18, 2024 7:34 PM
To: Lazar, Lijo ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Kamal, Asad 
; Xu, Feifei 
Subject: RE: [PATCH] drm/amdgpu: Fix pci state save during mode-1 reset

[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Lazar, Lijo 
Sent: Tuesday, June 18, 2024 16:44
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Kamal, Asad ; Xu, Feifei 

Subject: [PATCH] drm/amdgpu: Fix pci state save during mode-1 reset

Cache the PCI state before bus master is disabled. The saved state is later 
used for other cases like restoring config space after mode-2 reset.

Signed-off-by: Lijo Lazar 
Fixes: 5c03e5843e6b ("drm/amdgpu:add smu mode1/2 support for aldebaran")
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 3fb02f5b91c9..6c2ab14ca102 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5224,11 +5224,14 @@ int amdgpu_device_mode1_reset(struct amdgpu_device *adev)
 
 	dev_info(adev->dev, "GPU mode1 reset\n");
 
+	/* Cache the state before bus master disable. The saved config space
+	 * values are used in other cases like restore after mode-2 reset.
+	 */
+	amdgpu_device_cache_pci_state(adev->pdev);
+
 	/* disable BM */
 	pci_clear_master(adev->pdev);
 
-	amdgpu_device_cache_pci_state(adev->pdev);
-
 	if (amdgpu_dpm_is_mode1_reset_supported(adev)) {
 		dev_info(adev->dev, "GPU smu mode1 reset\n");
 		ret = amdgpu_dpm_mode1_reset(adev);
--
2.25.1
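
The effect of the reordering can be reproduced with a toy model of the PCI
command register. This is a sketch, not the driver's save/restore code: the
toy_pdev structure and helpers are hypothetical, and only the bus master
enable bit (0x4 in the PCI command register) is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PCI_COMMAND_MASTER 0x4 /* bus master enable bit */

/* Toy device: live command register plus one cached copy. */
struct toy_pdev {
	uint16_t command;
	uint16_t saved_command;
};

static void toy_cache_pci_state(struct toy_pdev *p)
{
	p->saved_command = p->command;
}

static void toy_clear_master(struct toy_pdev *p)
{
	p->command &= (uint16_t)~TOY_PCI_COMMAND_MASTER;
}

static void toy_restore_pci_state(struct toy_pdev *p)
{
	p->command = p->saved_command;
}

/*
 * Run the reset preamble in either order, then restore from the cache.
 * Returns nonzero if bus mastering is enabled after the restore.
 */
static int toy_reset_then_restore(struct toy_pdev *p, int cache_first)
{
	assert(p != 0);
	if (cache_first)
		toy_cache_pci_state(p);
	toy_clear_master(p);
	if (!cache_first)
		toy_cache_pci_state(p);
	toy_restore_pci_state(p);
	return (p->command & TOY_PCI_COMMAND_MASTER) != 0;
}
```

In this model, caching after pci_clear_master() stores a command value with
bus mastering off, so any later restore from that cache, such as the mode-2
case mentioned in the commit message, would bring the device back without it.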

RE: [PATCH] Revert "drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices"

2024-05-26 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: Alex Deucher 
Sent: Friday, May 24, 2024 2:44 AM
To: Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org; Xu, Feifei 
Subject: Re: [PATCH] Revert "drm/amdkfd: fix gfx_target_version for certain 
11.0.3 devices"

Ping?

On Mon, May 20, 2024 at 2:52 PM Alex Deucher  wrote:
>
> This reverts commit 28ebbb4981cb1fad12e0b1227dbecc88810b1ee8.
>
> Revert this commit as apparently the LLVM code to take advantage of
> this never landed.
>
> Signed-off-by: Alex Deucher 
> Cc: Feifei Xu 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 6b15e55811b69..fba9b9a258a50 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -426,15 +426,8 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device 
> *adev, bool vf)
> f2g = &gfx_v11_kfd2kgd;
> break;
> case IP_VERSION(11, 0, 3):
> -   if ((adev->pdev->device == 0x7460 &&
> -adev->pdev->revision == 0x00) ||
> -   (adev->pdev->device == 0x7461 &&
> -adev->pdev->revision == 0x00))
> -   /* Note: Compiler version is 11.0.5 while HW 
> version is 11.0.3 */
> -   gfx_target_version = 110005;
> -   else
> -   /* Note: Compiler version is 11.0.1 while HW 
> version is 11.0.3 */
> -   gfx_target_version = 110001;
> +   /* Note: Compiler version is 11.0.1 while HW version 
> is 11.0.3 */
> +   gfx_target_version = 110001;
> f2g = &gfx_v11_kfd2kgd;
> break;
> case IP_VERSION(11, 5, 0):
> --
> 2.45.1
>
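
The two comments in the diff pair 110001 with "11.0.1" and 110005 with
"11.0.5", which suggests a decimal packing of major, minor and stepping. The
sketch below decodes a value on that assumption (major * 10000 + minor * 100 +
stepping); the packing rule and the toy_* names are inferred for illustration
and are not stated in the patch.

```c
#include <assert.h>

/* Decoded target version; layout assumed from the comments in the diff. */
struct toy_gfx_version {
	unsigned int major;
	unsigned int minor;
	unsigned int stepping;
};

/* Split a packed value, assuming major * 10000 + minor * 100 + stepping. */
static struct toy_gfx_version toy_decode_gfx_target(unsigned int packed)
{
	struct toy_gfx_version v;

	v.major = packed / 10000;
	v.minor = (packed / 100) % 100;
	v.stepping = packed % 100;
	assert(v.major * 10000 + v.minor * 100 + v.stepping == packed);
	return v;
}
```

Under this reading the revert makes every 11.0.3 device report compiler
target 11.0.1 again.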

RE: [PATCH] drm/amdgpu: Don't show false warning for reg list

2024-06-03 Thread Xu, Feifei

[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Monday, June 3, 2024 2:58 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Li, Candice 
Subject: [PATCH] drm/amdgpu: Don't show false warning for reg list

If reg list is already loaded on PSP 13.0.2 SOCs, psp will give TEE_ERR_CANCEL 
response on second time load. Avoid printing warn message for it.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h |  5 +++--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 6d1911773043..079feb139b16 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -643,6 +643,20 @@ static const char *psp_gfx_cmd_name(enum psp_gfx_cmd_id 
cmd_id)
}
 }

+static bool psp_err_warn(struct psp_context *psp) {
+   struct psp_gfx_cmd_resp *cmd = psp->cmd_buf_mem;
+
+   /* This response indicates reg list is already loaded */
+   if (amdgpu_ip_version(psp->adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 2) &&
+   cmd->cmd_id == GFX_CMD_ID_LOAD_IP_FW &&
+   cmd->cmd.cmd_load_ip_fw.fw_type == GFX_FW_TYPE_REG_LIST &&
+   cmd->resp.status == TEE_ERROR_CANCEL)
+   return false;
+
+   return true;
+}
+
 static int
 psp_cmd_submit_buf(struct psp_context *psp,
   struct amdgpu_firmware_info *ucode, @@ -702,10 +716,13 @@ 
psp_cmd_submit_buf(struct psp_context *psp,
dev_warn(psp->adev->dev,
 "failed to load ucode %s(0x%X) ",
 amdgpu_ucode_name(ucode->ucode_id), 
ucode->ucode_id);
-   dev_warn(psp->adev->dev,
-"psp gfx command %s(0x%X) failed and response status 
is (0x%X)\n",
-psp_gfx_cmd_name(psp->cmd_buf_mem->cmd_id), 
psp->cmd_buf_mem->cmd_id,
-psp->cmd_buf_mem->resp.status);
+   if (psp_err_warn(psp))
+   dev_warn(
+   psp->adev->dev,
+   "psp gfx command %s(0x%X) failed and response 
status is (0x%X)\n",
+   psp_gfx_cmd_name(psp->cmd_buf_mem->cmd_id),
+   psp->cmd_buf_mem->cmd_id,
+   psp->cmd_buf_mem->resp.status);
/* If any firmware (including CAP) load fails under SRIOV, it 
should
 * return failure to stop the VF from initializing.
 * Also return failure in case of timeout diff --git 
a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h 
b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
index 7566973ed8f5..37b5ddd6f13b 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
+++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
@@ -464,8 +464,9 @@ struct psp_gfx_rb_frame  #define PSP_ERR_UNKNOWN_COMMAND 
0x0100

 enum tee_error_code {
-TEE_SUCCESS = 0x,
-TEE_ERROR_NOT_SUPPORTED = 0x000A,
+   TEE_SUCCESS = 0x,
+   TEE_ERROR_CANCEL= 0x0002,
+   TEE_ERROR_NOT_SUPPORTED = 0x000A,
 };

 #endif /* _PSP_TEE_GFX_IF_H_ */
--
2.25.1
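
The filter added by this patch is a four-way conjunction, which is easy to
exercise outside the driver. The following sketch mirrors its shape with
placeholder types: the enum values, the TOY_IP_VERSION packing and the
toy_resp structure are assumptions for illustration and do not match the real
PSP interface definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All constants below are placeholders, not the real interface values. */
#define TOY_IP_VERSION(maj, min, rev) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(rev))

enum toy_cmd_id  { TOY_CMD_LOAD_IP_FW = 1, TOY_CMD_OTHER };
enum toy_fw_type { TOY_FW_TYPE_REG_LIST = 1, TOY_FW_TYPE_OTHER };
enum toy_status  { TOY_STATUS_OK = 0, TOY_STATUS_CANCEL, TOY_STATUS_NOT_SUPPORTED };

struct toy_resp {
	uint32_t mp0_version;
	enum toy_cmd_id cmd_id;
	enum toy_fw_type fw_type;
	enum toy_status status;
};

/* Return false only for the known-benign "reg list already loaded" reply. */
static bool toy_should_warn(const struct toy_resp *r)
{
	assert(r != 0);
	if (r->mp0_version == TOY_IP_VERSION(13, 0, 2) &&
	    r->cmd_id == TOY_CMD_LOAD_IP_FW &&
	    r->fw_type == TOY_FW_TYPE_REG_LIST &&
	    r->status == TOY_STATUS_CANCEL)
		return false;

	return true;
}
```

Any deviation in SOC version, command, firmware type or status keeps the
warning, so the suppression stays as narrow as the commit message describes.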

RE: [PATCH 2/2] drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs

2024-01-22 Thread Xu, Feifei

[AMD Official Use Only - General]

Series is Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Tuesday, January 23, 2024 3:47 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; sta...@vger.kernel.org
Subject: [PATCH 2/2] drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs

This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and 
newer.  On GC 9.x and older, this needs to be set to 0. This can lead to hangs 
in some mixed graphics and compute workloads. Updated firmware is also required 
for AQL.

Signed-off-by: Alex Deucher 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c   | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 043eff309100..c1e10760 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -3846,7 +3846,7 @@ static int gfx_v11_0_compute_mqd_init(struct 
amdgpu_device *adev, void *m,
(order_base_2(prop->queue_size / 4) - 1));
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, RPTR_BLOCK_SIZE,
(order_base_2(AMDGPU_GPU_PAGE_SIZE / 4) - 1));
-   tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, UNORD_DISPATCH, 0);
+   tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, UNORD_DISPATCH, 1);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, TUNNEL_DISPATCH,
prop->allow_tunneling);
tmp = REG_SET_FIELD(tmp, CP_HQD_PQ_CONTROL, PRIV_STATE, 1); diff --git 
a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
index 15277f1d5cf0..d722cbd31783 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
@@ -224,6 +224,7 @@ static void update_mqd(struct mqd_manager *mm, void *mqd,
m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT;
m->cp_hqd_pq_control |=
ffs(q->queue_size / sizeof(unsigned int)) - 1 - 1;
+   m->cp_hqd_pq_control |= CP_HQD_PQ_CONTROL__UNORD_DISPATCH_MASK;
pr_debug("cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);

m->cp_hqd_pq_base_lo = lower_32_bits((uint64_t)q->queue_address >> 8);
--
2.42.0
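
A minimal sketch of the two ways the patch above sets the UNORD_DISPATCH bit: replacing a field in a register value (the gfx_v11_0.c hunk) and OR-ing in the field mask (the amdkfd hunk). The DEMO_* names and the bit position are hypothetical placeholders chosen for illustration, not values taken from the amdgpu register headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define DEMO_UNORD_DISPATCH_SHIFT 28u
#define DEMO_UNORD_DISPATCH_MASK  (UINT32_C(1) << DEMO_UNORD_DISPATCH_SHIFT)

/* Replace one field of a register value: clear it, then insert the new value. */
static uint32_t demo_set_field(uint32_t reg, uint32_t mask, uint32_t shift,
                               uint32_t val)
{
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Set a single-bit field by OR-ing in its mask, leaving other bits untouched. */
static uint32_t demo_set_unord_dispatch(uint32_t reg)
{
    return reg | DEMO_UNORD_DISPATCH_MASK;
}
```

Both forms produce the same result when the field is one bit wide and the new value is 1. The field-replace form is also able to clear the bit, which the OR form cannot do.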

RE: [PATCH] drm/amdgpu: update documentation on new chips

2024-01-25 Thread Xu, Feifei

[AMD Official Use Only - General]

Acked-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Friday, January 19, 2024 3:51 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH] drm/amdgpu: update documentation on new chips

These have been released now, so add them to the documentation.

Signed-off-by: Alex Deucher 
---
 Documentation/gpu/amdgpu/dgpu-asic-info-table.csv | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv 
b/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv
index 882d2518f8ed..3825f00ca9fe 100644
--- a/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv
+++ b/Documentation/gpu/amdgpu/dgpu-asic-info-table.csv
@@ -16,6 +16,7 @@ Radeon (RX|TM) (PRO|WX) Vega /MI25 /V320 /V340L /8200 /9100 /SSG MxGPU, VEGA10,
 AMD Radeon (Pro) VII /MI50 /MI60, VEGA20, DCE 12, 9.4.0, VCE 4.1.0 / UVD 7.2.0, 4.2.0
 MI100, ARCTURUS, *, 9.4.1, VCN 2.5.0, 4.2.2
 MI200, ALDEBARAN, *, 9.4.2, VCN 2.6.0, 4.4.0
+MI300, AQUA_VANGARAM, *, 9.4.3, VCN 4.0.3, 4.4.2
 AMD Radeon (RX|Pro) 5600(M|XT) /5700 (M|XT|XTB) /W5700, NAVI10, DCN 2.0.0, 10.1.10, VCN 2.0.0, 5.0.0
 AMD Radeon (Pro) 5300 /5500XTB/5500(XT|M) /W5500M /W5500, NAVI14, DCN 2.0.0, 10.1.1, VCN 2.0.2, 5.0.2
 AMD Radeon RX 6800(XT) /6900(XT) /W6800, SIENNA_CICHLID, DCN 3.0.0, 10.3.0, VCN 3.0.0, 5.2.0
@@ -23,4 +24,5 @@ AMD Radeon RX 6700 XT / 6800M / 6700M, NAVY_FLOUNDER, DCN 3.0.0, 10.3.2, VCN 3.0
 AMD Radeon RX 6600(XT) /6600M /W6600 /W6600M, DIMGREY_CAVEFISH, DCN 3.0.2, 10.3.4, VCN 3.0.16, 5.2.4
 AMD Radeon RX 6500M /6300M /W6500M /W6300M, BEIGE_GOBY, DCN 3.0.3, 10.3.5, VCN 3.0.33, 5.2.5
 AMD Radeon RX 7900 XT /XTX, , DCN 3.2.0, 11.0.0, VCN 4.0.0, 6.0.0
+AMD Radeon RX 7800 XT, , DCN 3.2.0, 11.0.3, VCN 4.0.0, 6.0.3
 AMD Radeon RX 7600M (XT) /7700S /7600S, , DCN 3.2.1, 11.0.2, VCN 4.0.4, 6.0.2
--
2.42.0

RE: [PATCH] drm/amd/pm: Update aldebaran pmfw interface

2021-03-23 Thread Xu, Feifei



Reviewed-by: Feifei Xu 

From: Lazar, Lijo 
Sent: Tuesday, March 23, 2021 9:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Feng, Kenneth ; Wang, Kevin(Yang) 
Subject: [PATCH] drm/amd/pm: Update aldebaran pmfw interface


[AMD Public Use]

Update aldebaran PMFW interfaces to version 0x6

Signed-off-by: Lijo Lazar <lijo.la...@amd.com>
---
.../gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h| 11 +--
drivers/gpu/drm/amd/pm/inc/smu_v13_0.h|  2 +-
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h 
b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
index df2ead254f37..d23533bda002 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h
@@ -435,8 +435,12 @@ typedef struct {
   uint8_t  GpioI2cSda; // Serial Data
   uint16_t spare5;

+  uint16_t XgmiMaxCurrent; // in Amps
+  int8_t   XgmiOffset; // in Amps
+  uint8_t  Padding_TelemetryXgmi;
+
   //reserved
-  uint32_t reserved[16];
+  uint32_t reserved[15];

 } PPTable_t;

@@ -481,7 +485,10 @@ typedef struct {
   uint16_t TemperatureAllHBM[4]  ;
   uint32_t GfxBusyAcc;
   uint32_t DramBusyAcc   ;
-  uint32_t Spare[4];
+  uint32_t EnergyAcc64bitLow ; //15.259uJ resolution
+  uint32_t EnergyAcc64bitHigh;
+  uint32_t TimeStampLow  ; //10ns resolution
+  uint32_t TimeStampHigh ;

   // Padding - ignore
   uint32_t MmHubPadding[8]; // SMU internal use
diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h 
b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
index 6db3464c09d6..8145e1cbf181 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h
@@ -26,7 +26,7 @@
#include "amdgpu_smu.h"

 #define SMU13_DRIVER_IF_VERSION_INV 0x
-#define SMU13_DRIVER_IF_VERSION_ALDE 0x5
+#define SMU13_DRIVER_IF_VERSION_ALDE 0x6

 /* MP Apertures */
#define MP0_Public  0x0380
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
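
A minimal sketch of why the PPTable_t hunk above shrinks reserved[16] to reserved[15]: the three new fields occupy exactly one 32-bit word, so taking one word out of the reserved area keeps the table size unchanged. The struct names are hypothetical and only the tail of the table is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tail of the table before the interface update. */
struct demo_tail_v5 {
    uint32_t reserved[16];
};

/* Hypothetical tail after the update: 2 + 1 + 1 bytes of new fields. */
struct demo_tail_v6 {
    uint16_t XgmiMaxCurrent;        /* in Amps */
    int8_t   XgmiOffset;            /* in Amps */
    uint8_t  Padding_TelemetryXgmi; /* pads the new fields to 4 bytes */
    uint32_t reserved[15];
};

/* Compile-time check that the layout change is size-neutral. */
_Static_assert(sizeof(struct demo_tail_v5) == sizeof(struct demo_tail_v6),
               "new fields must consume exactly one reserved word");
```

A compile-time check like this is one way to catch an accidental size change when a table shared with firmware is extended.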

RE: [PATCH] drm/amd/pm: Fix DPM level count on aldebaran

2021-03-25 Thread Xu, Feifei



Reviewed-by: Feifei Xu 

From: Lazar, Lijo 
Sent: Friday, March 26, 2021 2:04 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Feng, Kenneth ; Wang, Kevin(Yang) 
Subject: [PATCH] drm/amd/pm: Fix DPM level count on aldebaran


[AMD Public Use]

Firmware returns zero-based max level, increment by one to get
total levels. This fixes the issue of not showing all levels and current
frequency when frequency is at max DPM level.

Signed-off-by: Lijo Lazar <lijo.la...@amd.com>
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 12 
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index 1f860969ea1c..30c9ac635105 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -1710,10 +1710,14 @@ int smu_v13_0_get_dpm_level_count(struct smu_context *smu,
 enum smu_clk_type clk_type,
 uint32_t *value)
{
-  return smu_v13_0_get_dpm_freq_by_index(smu,
-                                         clk_type,
-                                         0xff,
-                                         value);
+ int ret;
+
+ ret = smu_v13_0_get_dpm_freq_by_index(smu, clk_type, 0xff, value);
+ /* FW returns 0 based max level, increment by one */
+ if (!ret && value)
+ ++(*value);
+
+ return ret;
}

 int smu_v13_0_set_single_dpm_table(struct smu_context *smu,
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
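
A minimal sketch of the off-by-one that the DPM level count patch above fixes, assuming a firmware query that answers index 0xff with the highest valid level index. The demo_* functions and the frequency table are hypothetical stand-ins, not the SMU interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_LEVELS 4u

/* Hypothetical firmware query: index 0xff asks for the highest level index. */
static int demo_fw_get_freq_by_index(uint32_t index, uint32_t *value)
{
    static const uint32_t freqs_mhz[DEMO_NUM_LEVELS] = { 500, 900, 1300, 1700 };

    if (value == NULL)
        return -1;
    if (index == 0xff) {
        *value = DEMO_NUM_LEVELS - 1; /* zero-based max level */
        return 0;
    }
    if (index >= DEMO_NUM_LEVELS)
        return -1;
    *value = freqs_mhz[index];
    return 0;
}

/* Convert the zero-based max level into a total count, as the patch does. */
static int demo_get_level_count(uint32_t *count)
{
    int ret = demo_fw_get_freq_by_index(0xff, count);

    if (!ret && count)
        ++(*count);
    return ret;
}
```

Without the increment, a caller looping from 0 to count - 1 would never visit the top level, which matches the symptom described in the commit message.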

RE: [PATCH Review 1/1] drm/amdgpu: optimize gfx ras features flag clean

2021-04-20 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: Stanley.Yang 
Sent: Monday, April 19, 2021 5:44 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Clements, John 
; Li, Dennis ; Xu, Feifei 
; Yang, Stanley 
Subject: [PATCH Review 1/1] drm/amdgpu: optimize gfx ras features flag clean

Signed-off-by: Stanley.Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ec3ebc33ee03..8fdf355d7de8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -657,11 +657,7 @@ static int __amdgpu_ras_feature_enable(struct 
amdgpu_device *adev,
 con->features |= BIT(head->block);
 } else {
 if (obj && amdgpu_ras_is_feature_enabled(adev, head)) {
-/* skip clean gfx ras context feature for VEGA20 Gaming.
- * will clean later
- */
-if (!(!adev->ras_features && con->features & BIT(AMDGPU_RAS_BLOCK__GFX)))
-con->features &= ~BIT(head->block);
+con->features &= ~BIT(head->block);
 put_obj(obj);
 }
 }
@@ -769,6 +765,10 @@ int amdgpu_ras_feature_enable_on_boot(struct amdgpu_device 
*adev,
 con->features |= BIT(head->block);

 ret = amdgpu_ras_feature_enable(adev, head, 0);
+
+/* clean gfx block ras features flag */
+if (adev->ras_features && head->block == AMDGPU_RAS_BLOCK__GFX)
+con->features &= ~BIT(head->block);
 }
 } else
 ret = amdgpu_ras_feature_enable(adev, head, enable);
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: correct sdma 4.x irq.num_types.

2021-04-25 Thread Xu, Feifei

Thanks. Will send V2 with the change.

Thanks,
Feifei

-Original Message-
From: Zhang, Hawking  
Sent: Sunday, April 25, 2021 3:13 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei 
Subject: RE: [PATCH] drm/amdgpu: correct sdma 4.x irq.num_types.

[AMD Public Use]

Please split the patch into two so the commit description matches the change.

Regards,
Hawking

-Original Message-
From: Feifei Xu 
Sent: Sunday, April 25, 2021 15:10
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH] drm/amdgpu: correct sdma 4.x irq.num_types.

Change the sdma interrupt info print level to debug.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 28 +++---
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index fbb701560ced..7870fd09d98d 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2220,7 +2220,7 @@ static int sdma_v4_0_print_iv_entry(struct amdgpu_device 
*adev,
 
instance = sdma_v4_0_irq_id_to_seq(entry->client_id);
if (instance < 0 || instance >= adev->sdma.num_instances) {
-   dev_err_ratelimited(adev->dev, "sdma instance invalid %d\n", 
instance);
+   dev_err(adev->dev, "sdma instance invalid %d\n", instance);
return -EINVAL;
}
 
@@ -2230,7 +2230,7 @@ static int sdma_v4_0_print_iv_entry(struct amdgpu_device 
*adev,
memset(&task_info, 0, sizeof(struct amdgpu_task_info));
amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
 
-   dev_info_ratelimited(adev->dev,
+   dev_dbg_ratelimited(adev->dev,
   "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
   "pasid:%u, for process %s pid %d thread %s pid %d\n",
   instance, addr, entry->src_id, entry->ring_id, entry->vmid, 
@@ -2243,7 +2243,7 @@ static int sdma_v4_0_process_vm_hole_irq(struct 
amdgpu_device *adev,
  struct amdgpu_irq_src *source,
  struct amdgpu_iv_entry *entry)  {
-   dev_err_ratelimited(adev->dev, "MC or SEM address in VM hole\n");
+   dev_dbg_ratelimited(adev->dev, "MC or SEM address in VM hole\n");
sdma_v4_0_print_iv_entry(adev, entry);
return 0;
 }
@@ -2253,7 +2253,7 @@ static int sdma_v4_0_process_doorbell_invalid_irq(struct 
amdgpu_device *adev,
  struct amdgpu_iv_entry *entry)  {
 
-   dev_err_ratelimited(adev->dev, "SDMA received a doorbell from BIF with 
byte_enable !=0xff\n");
+   dev_dbg_ratelimited(adev->dev, "SDMA received a doorbell from BIF with 
+byte_enable !=0xff\n");
sdma_v4_0_print_iv_entry(adev, entry);
return 0;
 }
@@ -2262,7 +2262,7 @@ static int sdma_v4_0_process_pool_timeout_irq(struct 
amdgpu_device *adev,
  struct amdgpu_irq_src *source,
  struct amdgpu_iv_entry *entry)  {
-   dev_err_ratelimited(adev->dev,
+   dev_dbg_ratelimited(adev->dev,
"Polling register/memory timeout executing POLL_REG/MEM with 
finite timer\n");
sdma_v4_0_print_iv_entry(adev, entry);
return 0;
@@ -2272,7 +2272,7 @@ static int sdma_v4_0_process_srbm_write_irq(struct 
amdgpu_device *adev,
  struct amdgpu_irq_src *source,
  struct amdgpu_iv_entry *entry)  {
-   dev_err_ratelimited(adev->dev,
+   dev_dbg_ratelimited(adev->dev,
"SDMA gets an Register Write SRBM_WRITE command in 
non-privilege command buffer\n");
sdma_v4_0_print_iv_entry(adev, entry);
return 0;
@@ -2609,14 +2609,18 @@ static void sdma_v4_0_set_irq_funcs(struct 
amdgpu_device *adev)
case 5:
adev->sdma.trap_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
adev->sdma.ecc_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
+   adev->sdma.vm_hole_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
+   adev->sdma.doorbell_invalid_irq.num_types = 
AMDGPU_SDMA_IRQ_INSTANCE5;
+   adev->sdma.pool_timeout_irq.num_types = 
AMDGPU_SDMA_IRQ_INSTANCE5;
+   adev->sdma.srbm_write_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
break;
case 8:
-   adev->sdma.trap_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
-   adev->sdma.ecc_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
-   adev->sdma.vm_hole_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
-   adev->sdma.doorbell_invalid_irq.num_types =

RE: [PATCH 1/2] drm/amdgpu: Correct sdma 4.x irq.num_types

2021-04-25 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Please ignore this one. I made a mistake on instance 8. Will resend the 
patch.

Thanks,
Feifei

-Original Message-
From: Feifei Xu 
Sent: Sunday, April 25, 2021 3:31 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH 1/2] drm/amdgpu: Correct sdma 4.x irq.num_types

correct and init the sdma4.x irq.num_types.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index fbb701560ced..2800b1b1f2ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2609,14 +2609,18 @@ static void sdma_v4_0_set_irq_funcs(struct 
amdgpu_device *adev)
 case 5:
 adev->sdma.trap_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
 adev->sdma.ecc_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
+adev->sdma.vm_hole_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
+adev->sdma.doorbell_invalid_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
+adev->sdma.pool_timeout_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
+adev->sdma.srbm_write_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
 break;
 case 8:
-adev->sdma.trap_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
-adev->sdma.ecc_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
-adev->sdma.vm_hole_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE5;
-adev->sdma.doorbell_invalid_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
-adev->sdma.pool_timeout_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
-adev->sdma.srbm_write_irq.num_types = AMDGPU_SDMA_IRQ_LAST;
+adev->sdma.trap_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE7;
+adev->sdma.ecc_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE7;
+adev->sdma.vm_hole_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE7;
+adev->sdma.doorbell_invalid_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE7;
+adev->sdma.pool_timeout_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE7;
+adev->sdma.srbm_write_irq.num_types = AMDGPU_SDMA_IRQ_INSTANCE7;
 break;
 case 2:
 default:
--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH Review 1/1] drm/amdgpu: force enable gfx ras for vega20 ws

2021-04-30 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Maybe the title can be more specific like:
drm/amdgpu: force enable gfx ras in hw_support for vega20 ws

With above modified.

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Stanley.Yang
Sent: Friday, April 30, 2021 2:52 PM
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Cc: Yang, Stanley 
Subject: [PATCH Review 1/1] drm/amdgpu: force enable gfx ras for vega20 ws

Signed-off-by: Stanley.Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index daf63a4c1fff..dfeaa57dd7ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -34,6 +34,7 @@
 #include "amdgpu_xgmi.h"
 #include "ivsrcid/nbio/irqsrcs_nbif_7_4.h"
 #include 
+#include "atom.h"

 static const char *RAS_FS_NAME = "ras";

@@ -2070,6 +2071,25 @@ static bool amdgpu_ras_asic_supported(struct amdgpu_device *adev)
 adev->asic_type == CHIP_SIENNA_CICHLID;
 }

+/*
+ * this is workaround for vega20 workstation sku,
+ * force enable gfx ras, ignore vbios gfx ras flag
+ * due to GC EDC can not write
+ */
+static void amdgpu_ras_get_quirks(struct amdgpu_device *adev,
+uint32_t *hw_supported)
+{
+struct atom_context *ctx = adev->mode_info.atom_context;
+
+if (!ctx)
+return;
+
+if (adev->asic_type == CHIP_VEGA20 &&
+strnstr(ctx->vbios_version, "D16406",
+sizeof(ctx->vbios_version)))
+*hw_supported |= (1 << AMDGPU_RAS_BLOCK__GFX);
+}
+
 /*
  * check hardware's ras ability which will be saved in hw_supported.
  * if hardware does not support ras, we can skip some ras initializtion and
@@ -2112,6 +2132,8 @@ static void amdgpu_ras_check_supported(struct amdgpu_device *adev,
 1 << AMDGPU_RAS_BLOCK__MMHUB);
 }

+amdgpu_ras_get_quirks(adev, hw_supported);
+
 /* hw_supported needs to be aligned with RAS block mask. */
 *hw_supported &= AMDGPU_RAS_BLOCK_MASK;

--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amd/pm: fix return value in aldebaran_set_mp1_state()

2021-05-20 Thread Xu, Feifei

[AMD Official Use Only]

PP_MP1_STATE_NONE should be valid, while the other two are not valid for 
Aldebaran for now. It depends on whether the driver wants to return an error 
if the message falls outside the 4 cases.
The benefit of handling all the cases is that potential bugs are caught easily. 
But the handling of these 4 and of the others is the same in both the driver 
and PMFW, which should skip them and return 0.
So I am OK with simplifying the code logic. I will take your suggestion and 
return 0 for default. Thanks.

Thanks,
Feifei

-Original Message-
From: Lazar, Lijo 
Sent: Thursday, May 20, 2021 7:19 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/pm: fix return value in aldebaran_set_mp1_state()

This now handles all of the states. Those states are not valid (and therefore 
not handled) for Aldebaran. If the intent is to skip handling of any other 
state, maybe just return 0 for default, or skip default altogether so that it 
falls through to return 0 for any other state.

In any case,

Reviewed-by: Lijo Lazar 

On 5/20/2021 3:20 PM, Feifei Xu wrote:
> We should just return error in invalid case. For valid but not
> implemented one, do nothing and return 0. Otherwise resume will abort
> because of the wrong return value.
>
> Signed-off-by: Feifei Xu 
> ---
>   drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 6 --
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> index 5d04a1dfdfd8..5fcfd8e1a548 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
> @@ -1781,13 +1781,15 @@ static int aldebaran_set_mp1_state(struct smu_context 
> *smu,
>  enum pp_mp1_state mp1_state)
>   {
>   switch (mp1_state) {
> + case PP_MP1_STATE_NONE:
> + case PP_MP1_STATE_RESET:
> + case PP_MP1_STATE_SHUTDOWN:
> + return 0;
>   case PP_MP1_STATE_UNLOAD:
>   return smu_cmn_set_mp1_state(smu, mp1_state);
>   default:
>   return -EINVAL;
>   }
> -
> - return 0;
>   }
>
>   static const struct pptable_funcs aldebaran_ppt_funcs = {
>

--
Thanks,
Lijo
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
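
A minimal sketch of the simplified logic agreed on in the thread above: only the unload state sends a message, and every other state is skipped with a return value of 0 so that a caller such as resume does not abort. The enum, the counter, and the demo_* functions are hypothetical and do not reproduce the driver's definitions.

```c
#include <assert.h>

/* Hypothetical state list, for illustration only. */
enum demo_mp1_state {
    DEMO_MP1_STATE_NONE,
    DEMO_MP1_STATE_SHUTDOWN,
    DEMO_MP1_STATE_UNLOAD,
    DEMO_MP1_STATE_RESET,
};

/* Counts messages so the behaviour can be observed. */
static int demo_msgs_sent;

/* Stand-in for the call that notifies firmware of the new state. */
static int demo_send_mp1_state(enum demo_mp1_state state)
{
    (void)state;
    demo_msgs_sent++;
    return 0;
}

static int demo_set_mp1_state(enum demo_mp1_state state)
{
    switch (state) {
    case DEMO_MP1_STATE_UNLOAD:
        return demo_send_mp1_state(state);
    default:
        /* Not implemented for this ASIC: do nothing and report success. */
        return 0;
    }
}
```

The trade-off raised in the reply still applies: a default that returns 0 is simpler, but it no longer flags an unexpected state value as an error.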

RE: [PATCH] drm/amdgpu: Set ttm caching flags during bo allocation

2021-06-29 Thread Xu, Feifei

[AMD Official Use Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Oak Zeng
Sent: Tuesday, June 29, 2021 7:16 AM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Zhu, James ; 
Koenig, Christian ; Zeng, Oak 
Subject: [PATCH] drm/amdgpu: Set ttm caching flags during bo allocation

The ttm caching flags (ttm_cached, ttm_write_combined etc) are used to 
determine a buffer object's mapping attributes in both CPU page table and GPU 
page table (when that buffer is also accessed by GPU). Currently the ttm 
caching flags are set in function amdgpu_ttm_io_mem_reserve which is called 
during DRM_AMDGPU_GEM_MMAP ioctl. This has a problem since the GPU mapping of 
the buffer object (ioctl DRM_AMDGPU_GEM_VA) can happen earlier than the mmap 
time, thus the GPU page table update code can't pick up the right ttm caching 
flags to decide the right GPU page table attributes.

This patch moves the ttm caching flags setting to function amdgpu_vram_mgr_new 
- this function is called during the first step of a buffer object create (eg, 
DRM_AMDGPU_GEM_CREATE) so the later both CPU and GPU mapping function calls 
will pick up this flag for CPU/GPU page table set up.

Signed-off-by: Oak Zeng 
Suggested-by: Christian Koenig 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  | 4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 5 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 6297363..93acf6f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -607,10 +607,6 @@ static int amdgpu_ttm_io_mem_reserve(struct ttm_device 
*bdev,

mem->bus.offset += adev->gmc.aper_base;
mem->bus.is_iomem = true;
-   if (adev->gmc.xgmi.connected_to_cpu)
-   mem->bus.caching = ttm_cached;
-   else
-   mem->bus.caching = ttm_write_combined;
break;
default:
return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index a52e17e..6cb66eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -454,6 +454,11 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager 
*man,
if (i == 1)
mem->placement |= TTM_PL_FLAG_CONTIGUOUS;

+   if (adev->gmc.xgmi.connected_to_cpu)
+   mem->bus.caching = ttm_cached;
+   else
+   mem->bus.caching = ttm_write_combined;
+
atomic64_add(vis_usage, &mgr->vis_usage);
mem->mm_node = nodes;
return 0;
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
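
A minimal sketch of the ordering problem described in the commit message above, assuming a buffer object whose caching attribute is read by the GPU mapping path. All demo_* names are hypothetical and the model ignores everything except when the attribute is assigned.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_caching { DEMO_UNSET, DEMO_WRITE_COMBINED, DEMO_CACHED };

struct demo_bo {
    enum demo_caching caching;
};

/* Same decision the patch makes from connected_to_cpu. */
static enum demo_caching demo_pick_caching(bool connected_to_cpu)
{
    return connected_to_cpu ? DEMO_CACHED : DEMO_WRITE_COMBINED;
}

/* Old flow: creation leaves the attribute unset, CPU mmap fills it in. */
static void demo_create_old(struct demo_bo *bo)
{
    bo->caching = DEMO_UNSET;
}

static void demo_mmap_old(struct demo_bo *bo, bool connected_to_cpu)
{
    bo->caching = demo_pick_caching(connected_to_cpu);
}

/* New flow: the attribute is decided when the buffer is created. */
static void demo_create_new(struct demo_bo *bo, bool connected_to_cpu)
{
    bo->caching = demo_pick_caching(connected_to_cpu);
}

/* GPU mapping simply uses whatever attribute is present at that time. */
static enum demo_caching demo_gpu_map(const struct demo_bo *bo)
{
    return bo->caching;
}
```

In the old flow a GPU mapping made before the CPU mmap sees no attribute at all. In the new flow the order of the two mappings no longer matters.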

RE: [PATCH] drm/amd/pm: drop redundant MEM_TYPE_* macros

2020-08-14 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Friday, August 14, 2020 4:44 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 

Subject: [PATCH] drm/amd/pm: drop redundant MEM_TYPE_* macros

These are already defined in amdgpu_atombios.h. Otherwise, we may hit a 
"redefined" compile warning.

Change-Id: Ia2a9e10b35173fedcbbd8e0abb8ad38dd231baf4
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.h | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.h
index 3ee54f182943..76ed2e413594 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.h
@@ -26,15 +26,6 @@

 #include "hwmgr.h"

-#define MEM_TYPE_GDDR5  0x50
-#define MEM_TYPE_GDDR4  0x40
-#define MEM_TYPE_GDDR3  0x30
-#define MEM_TYPE_DDR2   0x20
-#define MEM_TYPE_GDDR1  0x10
-#define MEM_TYPE_DDR3   0xb0
-#define MEM_TYPE_MASK   0xF0
-
-
 /* As returned from PowerConnectorDetectionTable. */
 #define PP_ATOM_POWER_BUDGET_DISABLE_OVERDRIVE  0x80
 #define PP_ATOM_POWER_BUDGET_SHOW_WARNING   0x40
--
2.28.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amd/display: Fix a list corruption

2020-09-01 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Acked-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Pan, Xinhui
Sent: Tuesday, September 1, 2020 3:58 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH] drm/amd/display: Fix a list corruption

[AMD Official Use Only - Internal Distribution Only]

Remove the private obj from the internal list before we free aconnector.

[   56.925828] BUG: unable to handle page fault for address: 8f84a870a560
[   56.933272] #PF: supervisor read access in kernel mode
[   56.938801] #PF: error_code(0x) - not-present page
[   56.944376] PGD 18e605067 P4D 18e605067 PUD 86a614067 PMD 86a4d0067 PTE 
8008578f5060
[   56.953260] Oops:  [#1] SMP DEBUG_PAGEALLOC NOPTI
[   56.958815] CPU: 6 PID: 1407 Comm: bash Tainted: G   O  
5.9.0-rc2+ #46
[   56.967092] Hardware name: System manufacturer System Product Name/PRIME 
Z390-A, BIOS 1401 11/26/2019
[   56.977162] RIP: 0010:__list_del_entry_valid+0x31/0xa0
[   56.982768] Code: 00 ad de 55 48 8b 17 4c 8b 47 08 48 89 e5 48 39 c2 74 27 
48 b8 22 01 00 00 00 00 ad de 49 39 c0 74 2d 49 8b 30 48 39 fe 75 3d <48> 8b 52 
08 48 39 f2 75 4c b8 01 00 00 00 5d c3 48 89 7
[   57.003327] RSP: 0018:b40c81687c90 EFLAGS: 00010246
[   57.009048] RAX: dead0122 RBX: 8f84ea41f4f0 RCX: 0006
[   57.016871] RDX: 8f84a870a558 RSI: 8f84ea41f4f0 RDI: 8f84ea41f4f0
[   57.024672] RBP: b40c81687c90 R08: 8f84ea400998 R09: 0001
[   57.032490] R10:  R11:  R12: 0006
[   57.040287] R13: 8f84ea422a90 R14: 8f84b4129a20 R15: fff2
[   57.048105] FS:  7f550d885740() GS:8f850960() 
knlGS:
[   57.056979] CS:  0010 DS:  ES:  CR0: 80050033
[   57.063260] CR2: 8f84a870a560 CR3: 0007e5144001 CR4: 003706e0
[   57.071053] DR0:  DR1:  DR2: 
[   57.078849] DR3:  DR6: fffe0ff0 DR7: 0400
[   57.086684] Call Trace:
[   57.089381]  drm_atomic_private_obj_fini+0x29/0x82 [drm]
[   57.095247]  amdgpu_dm_fini+0x83/0x170 [amdgpu]
[   57.100264]  dm_hw_fini+0x23/0x30 [amdgpu]
[   57.104814]  amdgpu_device_fini+0x1df/0x4fe [amdgpu]
[   57.110271]  amdgpu_driver_unload_kms+0x43/0x70 [amdgpu]
[   57.116136]  amdgpu_pci_remove+0x3b/0x60 [amdgpu]
[   57.121291]  pci_device_remove+0x3e/0xb0
[   57.125583]  device_release_driver_internal+0xff/0x1d0
[   57.131223]  device_release_driver+0x12/0x20
[   57.135903]  pci_stop_bus_device+0x70/0xa0
[   57.140401]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
[   57.146571]  remove_store+0x7b/0x90
[   57.150429]  dev_attr_store+0x17/0x30
[   57.154441]  sysfs_kf_write+0x4b/0x60
[   57.158479]  kernfs_fop_write+0xe8/0x1d0
[   57.162788]  vfs_write+0xf5/0x230
[   57.166426]  ksys_write+0x70/0xf0
[   57.170087]  __x64_sys_write+0x1a/0x20
[   57.174219]  do_syscall_64+0x38/0x90
[   57.178145]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f52533ee7372..cb624ee70545 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5076,6 +5076,7 @@ static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
 struct amdgpu_device *adev = drm_to_adev(connector->dev);
 struct amdgpu_display_manager *dm = &adev->dm;

+drm_atomic_private_obj_fini(&aconnector->mst_mgr.base);
 #if defined(CONFIG_BACKLIGHT_CLASS_DEVICE) ||\
 defined(CONFIG_BACKLIGHT_CLASS_DEVICE_MODULE)

--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
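
A minimal sketch of the rule behind the list corruption fix above: an object embedded in a linked list has to be unlinked before its memory is freed, otherwise a later list walk dereferences freed memory. The list helpers and demo_connector are hypothetical userspace stand-ins, not the DRM private-object API.

```c
#include <assert.h>
#include <stdlib.h>

/* Tiny circular doubly linked list, in the style of the kernel's list_head. */
struct demo_node {
    struct demo_node *prev, *next;
};

static void demo_list_init(struct demo_node *head)
{
    head->prev = head->next = head;
}

static void demo_list_add(struct demo_node *n, struct demo_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static void demo_list_del(struct demo_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static int demo_list_empty(const struct demo_node *head)
{
    return head->next == head;
}

/* Object that owns a list entry, like a connector owning a private object. */
struct demo_connector {
    struct demo_node link;
    int id;
};

static struct demo_connector *demo_connector_create(struct demo_node *head, int id)
{
    struct demo_connector *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->id = id;
    demo_list_add(&c->link, head);
    return c;
}

static void demo_connector_destroy(struct demo_connector *c)
{
    demo_list_del(&c->link); /* unlink first, so the list never points at freed memory */
    free(c);
}
```

If demo_connector_destroy() skipped the unlink, the list head would keep a pointer into freed memory, which is the kind of fault the __list_del_entry_valid trace above reports.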

RE: [PATCH] drm/amd/display: fix return value check for hdcp_work

2020-09-23 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Flora Cui
Sent: Wednesday, September 23, 2020 2:54 PM
To: amd-gfx@lists.freedesktop.org
Cc: Cui, Flora 
Subject: [PATCH] drm/amd/display: fix return value check for hdcp_work

max_caps might be 0, thus hdcp_work might be ZERO_SIZE_PTR

Signed-off-by: Flora Cui 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
index 694c5bc93665..c2cd184f0bbd 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c
@@ -604,7 +604,7 @@ struct hdcp_workqueue *hdcp_create_workqueue(struct 
amdgpu_device *adev, struct
 int i = 0;

 hdcp_work = kcalloc(max_caps, sizeof(*hdcp_work), GFP_KERNEL);
-if (hdcp_work == NULL)
+if (ZERO_OR_NULL_PTR(hdcp_work))
 return NULL;

 hdcp_work->srm = kcalloc(PSP_HDCP_SRM_FIRST_GEN_MAX_SIZE, 
sizeof(*hdcp_work->srm), GFP_KERNEL);
--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
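
A minimal userspace sketch of why the hdcp_work patch above replaces the NULL check: a zero-size kernel allocation returns the non-NULL sentinel ZERO_SIZE_PTR, which must not be dereferenced. The DEMO_* macros mirror the kernel's slab definitions as I understand them, and demo_kcalloc() is a hypothetical emulation built on calloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sentinel for zero-size allocations: non-NULL, but never valid to dereference. */
#define DEMO_ZERO_SIZE_PTR ((void *)16)
#define DEMO_ZERO_OR_NULL_PTR(p) \
    ((uintptr_t)(p) <= (uintptr_t)DEMO_ZERO_SIZE_PTR)

/* Hypothetical emulation: a zero-size request yields the sentinel. */
static void *demo_kcalloc(size_t n, size_t size)
{
    if (n == 0 || size == 0)
        return DEMO_ZERO_SIZE_PTR;
    return calloc(n, size);
}

/* Free only real allocations; the sentinel and NULL are ignored. */
static void demo_kfree(void *p)
{
    if (!DEMO_ZERO_OR_NULL_PTR(p))
        free(p);
}
```

With max_caps equal to 0, a plain comparison against NULL passes and the following hdcp_work->srm access would dereference the sentinel. The combined check rejects both cases.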

RE: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

2021-03-03 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Thanks. VegaM still needs to be ruled out.

Thanks,
Feifei

-Original Message-
From: Alex Deucher 
Sent: Thursday, March 4, 2021 12:12 PM
To: Xu, Feifei 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

On Wed, Mar 3, 2021 at 10:58 PM Feifei Xu  wrote:
>
> SDMA 4_x asics share the same MGCG/MGLS setting.
>
> Signed-off-by: Feifei Xu 
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12 +---
>  1 file changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index 3bede8a70d7e..f46169c048fd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -2271,21 +2271,11 @@ static int sdma_v4_0_set_clockgating_state(void 
> *handle,
> if (amdgpu_sriov_vf(adev))
> return 0;
>
> -   switch (adev->asic_type) {
> -   case CHIP_VEGA10:
> -   case CHIP_VEGA12:
> -   case CHIP_VEGA20:
> -   case CHIP_RAVEN:
> -   case CHIP_ARCTURUS:
> -   case CHIP_RENOIR:
> -   case CHIP_ALDEBARAN:
> +   if (adev->asic_type >= CHIP_VEGA10){

Need a space between ) and {.  That said, do we even need to check the asic 
type here at all?  I think this applies to all chips that have sdma4.

Alex

> sdma_v4_0_update_medium_grain_clock_gating(adev,
> state == AMD_CG_STATE_GATE);
> sdma_v4_0_update_medium_grain_light_sleep(adev,
> state == AMD_CG_STATE_GATE);
> -   break;
> -   default:
> -   break;
> }
> return 0;
>  }
> --
> 2.25.1
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

2021-03-03 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Fixed the coding style error and resent. Thanks.

Thanks,
Feifei

-Original Message-
From: Feifei Xu 
Sent: Thursday, March 4, 2021 12:54 PM
To: amd-gfx@lists.freedesktop.org; alexdeuc...@gmail.com
Cc: Deucher, Alexander ; Xu, Feifei 

Subject: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

SDMA 4_x asics share the same MGCG/MGLS setting.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 3bede8a70d7e..0280e8f589d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2271,21 +2271,11 @@ static int sdma_v4_0_set_clockgating_state(void *handle,
 if (amdgpu_sriov_vf(adev))
 return 0;

-switch (adev->asic_type) {
-case CHIP_VEGA10:
-case CHIP_VEGA12:
-case CHIP_VEGA20:
-case CHIP_RAVEN:
-case CHIP_ARCTURUS:
-case CHIP_RENOIR:
-case CHIP_ALDEBARAN:
+if (adev->asic_type >= CHIP_VEGA10) {
 sdma_v4_0_update_medium_grain_clock_gating(adev,
 state == AMD_CG_STATE_GATE);
 sdma_v4_0_update_medium_grain_light_sleep(adev,
 state == AMD_CG_STATE_GATE);
-break;
-default:
-break;
 }
 return 0;
 }
--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

2021-03-03 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

OK. Thanks for pointing it out. I will modify the patch to remove the check.

Thanks,
Feifei

-Original Message-
From: Alex Deucher 
Sent: Thursday, March 4, 2021 1:20 PM
To: Xu, Feifei 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

On Wed, Mar 3, 2021 at 11:44 PM Xu, Feifei  wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Thanks. VegaM still needs to be ruled out.

VegaM is SDMA 3.x.

Alex

>
> Thanks,
> Feifei
>
> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, March 4, 2021 12:12 PM
> To: Xu, Feifei 
> Cc: amd-gfx list 
> Subject: Re: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.
>
> On Wed, Mar 3, 2021 at 10:58 PM Feifei Xu  wrote:
> >
> > SDMA 4_x asics share the same MGCG/MGLS setting.
> >
> > Signed-off-by: Feifei Xu 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12 +---
> >  1 file changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > index 3bede8a70d7e..f46169c048fd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > @@ -2271,21 +2271,11 @@ static int sdma_v4_0_set_clockgating_state(void 
> > *handle,
> > if (amdgpu_sriov_vf(adev))
> > return 0;
> >
> > -   switch (adev->asic_type) {
> > -   case CHIP_VEGA10:
> > -   case CHIP_VEGA12:
> > -   case CHIP_VEGA20:
> > -   case CHIP_RAVEN:
> > -   case CHIP_ARCTURUS:
> > -   case CHIP_RENOIR:
> > -   case CHIP_ALDEBARAN:
> > +   if (adev->asic_type >= CHIP_VEGA10){
>
> Need a space between ) and {.  That said, do we even need to check the asic 
> type here at all?  I think this applies to all chips that have sdma4.
>
> Alex
>
> > sdma_v4_0_update_medium_grain_clock_gating(adev,
> > state == AMD_CG_STATE_GATE);
> > sdma_v4_0_update_medium_grain_light_sleep(adev,
> > state == AMD_CG_STATE_GATE);
> > -   break;
> > -   default:
> > -   break;
> > }
> > return 0;
> >  }
> > --
> > 2.25.1
> >
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

2021-03-03 Thread Xu, Feifei

Thanks. I will modify the patch to remove the check, since all SDMA 4_x ASICs 
share the same setting logic.

Thanks,
Feifei

-Original Message-
From: Lazar, Lijo  
Sent: Thursday, March 4, 2021 1:37 PM
To: Alex Deucher ; Xu, Feifei 
Cc: amd-gfx list 
Subject: RE: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

[AMD Public Use]

There shouldn't be any check based on ASIC type. If a check is required, it 
should be based on  AMD_CG_SUPPORT_SDMA_MGCG and AMD_CG_SUPPORT_SDMA_LS. We set 
the flags appropriately for each ASIC in soc15.
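
To illustrate the flag-based check suggested here, a minimal standalone sketch follows. The struct, the flag values and the mock_* function names are hypothetical stand-ins for illustration, not the driver's actual definitions; the only idea taken from this thread is that gating should depend on per-ASIC capability bits rather than on the ASIC type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the real driver defines its own values. */
#define MOCK_CG_SUPPORT_SDMA_MGCG (1u << 0)
#define MOCK_CG_SUPPORT_SDMA_LS   (1u << 1)

/* Hypothetical stand-in for the device structure. */
struct mock_device {
	uint32_t cg_flags;  /* capability bits, assumed to be set once per ASIC */
	bool mgcg_enabled;  /* medium grain clock gating state */
	bool ls_enabled;    /* medium grain light sleep state */
};

/* Enable a feature only if it is requested AND the ASIC advertises it. */
static void mock_update_mgcg(struct mock_device *dev, bool enable)
{
	dev->mgcg_enabled = enable && (dev->cg_flags & MOCK_CG_SUPPORT_SDMA_MGCG);
}

static void mock_update_ls(struct mock_device *dev, bool enable)
{
	dev->ls_enabled = enable && (dev->cg_flags & MOCK_CG_SUPPORT_SDMA_LS);
}

/* No ASIC-type switch: each helper consults the capability flags itself. */
static int mock_set_clockgating_state(struct mock_device *dev, bool gate)
{
	mock_update_mgcg(dev, gate);
	mock_update_ls(dev, gate);
	return 0;
}
```

With this shape, supporting a new chip only requires setting the right flag bits for it; the gating function itself does not change.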

Thanks,
Lijo

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Thursday, March 4, 2021 10:50 AM
To: Xu, Feifei 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.

On Wed, Mar 3, 2021 at 11:44 PM Xu, Feifei  wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Thanks. VegaM still needs to be ruled out.

VegaM is SDMA 3.x.

Alex

>
> Thanks,
> Feifei
>
> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, March 4, 2021 12:12 PM
> To: Xu, Feifei 
> Cc: amd-gfx list 
> Subject: Re: [PATCH] drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.
>
> On Wed, Mar 3, 2021 at 10:58 PM Feifei Xu  wrote:
> >
> > SDMA 4_x asics share the same MGCG/MGLS setting.
> >
> > Signed-off-by: Feifei Xu 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12 +---
> >  1 file changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > index 3bede8a70d7e..f46169c048fd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > @@ -2271,21 +2271,11 @@ static int sdma_v4_0_set_clockgating_state(void 
> > *handle,
> > if (amdgpu_sriov_vf(adev))
> > return 0;
> >
> > -   switch (adev->asic_type) {
> > -   case CHIP_VEGA10:
> > -   case CHIP_VEGA12:
> > -   case CHIP_VEGA20:
> > -   case CHIP_RAVEN:
> > -   case CHIP_ARCTURUS:
> > -   case CHIP_RENOIR:
> > -   case CHIP_ALDEBARAN:
> > +   if (adev->asic_type >= CHIP_VEGA10){
>
> Need a space between ) and {.  That said, do we even need to check the asic 
> type here at all?  I think this applies to all chips that have sdma4.
>
> Alex
>
> > sdma_v4_0_update_medium_grain_clock_gating(adev,
> > state == AMD_CG_STATE_GATE);
> > sdma_v4_0_update_medium_grain_light_sleep(adev,
> > state == AMD_CG_STATE_GATE);
> > -   break;
> > -   default:
> > -   break;
> > }
> > return 0;
> >  }
> > --
> > 2.25.1
> >
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: add sdma 4_x interrupts printing

2021-03-03 Thread Xu, Feifei

Thanks. Will modify like this:
if (instance < 0 || instance > adev->sdma.num_instances) {

Thanks,
Feifei

-Original Message-
From: Zhang, Hawking  
Sent: Thursday, March 4, 2021 2:54 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei 
Subject: RE: [PATCH] drm/amdgpu: add sdma 4_x interrupts printing

[AMD Public Use]

+   if (instance < 0 || instance > 7){

Please check sdma.num_instances for the maximum instance instead of hard-coding 
it to 7. The number of SDMA instances varies from ASIC to ASIC.

With the above fixed, the patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Feifei Xu  
Sent: Thursday, March 4, 2021 11:45
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH] drm/amdgpu: add sdma 4_x interrupts printing

Add VM_HOLE/DOORBELL_INVALID_BE/POLL_TIMEOUT/SRBMWRITE
interrupt info printing.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |   5 +
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   | 119 +++
 2 files changed, 124 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
index e5b8fb8e75c5..f8fb755e3aa6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
@@ -64,6 +64,11 @@ struct amdgpu_sdma {
struct amdgpu_irq_src   trap_irq;
struct amdgpu_irq_src   illegal_inst_irq;
struct amdgpu_irq_src   ecc_irq;
+   struct amdgpu_irq_src   vm_hole_irq;
+   struct amdgpu_irq_src   doorbell_invalid_irq;
+   struct amdgpu_irq_src   pool_timeout_irq;
+   struct amdgpu_irq_src   srbm_write_irq;
+
int num_instances;
uint32_tsrbm_soft_reset;
boolhas_page_queue;
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 3bede8a70d7e..3305b8ec5025 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1944,6 +1944,33 @@ static int sdma_v4_0_sw_init(void *handle)
return r;
}
 
+   /* SDMA VM_HOLE/DOORBELL_INV/POLL_TIMEOUT/SRBM_WRITE_PROTECTION event*/
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+ SDMA0_4_0__SRCID__SDMA_VM_HOLE,
+ &adev->sdma.vm_hole_irq);
+   if (r)
+   return r;
+
+   r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+ SDMA0_4_0__SRCID__SDMA_DOORBELL_INVALID,
+ &adev->sdma.doorbell_invalid_irq);
+   if (r)
+   return r;
+
+   r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+ SDMA0_4_0__SRCID__SDMA_POLL_TIMEOUT,
+ &adev->sdma.pool_timeout_irq);
+   if (r)
+   return r;
+
+   r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+ SDMA0_4_0__SRCID__SDMA_SRBMWRITE,
+ &adev->sdma.srbm_write_irq);
+   if (r)
+   return r;
+   }
+
for (i = 0; i < adev->sdma.num_instances; i++) {
ring = &adev->sdma.instance[i].ring;
ring->ring_obj = NULL;
@@ -2198,6 +2225,72 @@ static int sdma_v4_0_set_ecc_irq_state(struct 
amdgpu_device *adev,
return 0;
 }
 
+static int sdma_v4_0_print_iv_entry(struct amdgpu_device *adev,
+ struct amdgpu_iv_entry *entry) {
+   int instance;
+   struct amdgpu_task_info task_info;
+   u64 addr;
+   instance = sdma_v4_0_irq_id_to_seq(entry->client_id);
+   if (instance < 0 || instance > 7){
+   dev_err(adev->dev, "sdma instance invalid %d\n", instance);
+   return -EINVAL;
+   }
+
+   addr = (u64)entry->src_data[0] << 12;
+   addr |= ((u64)entry->src_data[1] & 0xf) << 44;
+
+   memset(&task_info, 0, sizeof(struct amdgpu_task_info));
+   amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
+
+   dev_info(adev->dev,
+  "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
+  "pasid:%u, for process %s pid %d thread %s pid %d\n",
+  instance, addr, entry->src_id, entry->ring_id, entry->vmid,
+  entry->pasid, task_info.process_name, task_info.tgid,
+  task_info.task_name, task_info.pid);
+   return 0;
+}
+
+static int sdma_v4_0_process_vm_hole_irq(struct amdgpu_device *ade

RE: [PATCH] drm/amdgpu: add sdma 4_x interrupts printing

2021-03-03 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

1) Fixed the hard-coded limit as:
if (instance < 0 || instance >= adev->sdma.num_instances) {

2) Fixed some coding style errors/warnings
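
A minimal standalone sketch of the bounds check in item 1, assuming instances are indexed from 0 so that num_instances itself is already out of range. The struct and function names are hypothetical stand-ins, not driver code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the per-device SDMA state. */
struct mock_sdma {
	int num_instances;  /* valid indices are 0 .. num_instances - 1 */
};

/* Return 0 for a usable instance index, -EINVAL otherwise. */
static int mock_validate_instance(const struct mock_sdma *sdma, int instance)
{
	/* ">=" rather than ">": an index equal to the count is invalid. */
	if (instance < 0 || instance >= sdma->num_instances)
		return -EINVAL;
	return 0;
}
```

Comparing against the runtime count instead of a constant keeps the check valid on parts with fewer or more SDMA engines.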

Thanks,
Feifei

-Original Message-
From: Feifei Xu 
Sent: Thursday, March 4, 2021 3:40 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH] drm/amdgpu: add sdma 4_x interrupts printing

Add VM_HOLE/DOORBELL_INVALID_BE/POLL_TIMEOUT/SRBMWRITE
interrupt info printing.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |   5 +
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   | 119 +++
 2 files changed, 124 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
index e5b8fb8e75c5..f8fb755e3aa6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
@@ -64,6 +64,11 @@ struct amdgpu_sdma {
 struct amdgpu_irq_srctrap_irq;
 struct amdgpu_irq_srcillegal_inst_irq;
 struct amdgpu_irq_srcecc_irq;
+struct amdgpu_irq_srcvm_hole_irq;
+struct amdgpu_irq_srcdoorbell_invalid_irq;
+struct amdgpu_irq_srcpool_timeout_irq;
+struct amdgpu_irq_srcsrbm_write_irq;
+
 intnum_instances;
 uint32_tsrbm_soft_reset;
 boolhas_page_queue;
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 70d247841d14..bcf3d62e3cb8 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1944,6 +1944,33 @@ static int sdma_v4_0_sw_init(void *handle)
 return r;
 }

+/* SDMA VM_HOLE/DOORBELL_INV/POLL_TIMEOUT/SRBM_WRITE_PROTECTION event*/
+for (i = 0; i < adev->sdma.num_instances; i++) {
+r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+  SDMA0_4_0__SRCID__SDMA_VM_HOLE,
+  &adev->sdma.vm_hole_irq);
+if (r)
+return r;
+
+r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+  SDMA0_4_0__SRCID__SDMA_DOORBELL_INVALID,
+  &adev->sdma.doorbell_invalid_irq);
+if (r)
+return r;
+
+r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+  SDMA0_4_0__SRCID__SDMA_POLL_TIMEOUT,
+  &adev->sdma.pool_timeout_irq);
+if (r)
+return r;
+
+r = amdgpu_irq_add_id(adev, sdma_v4_0_seq_to_irq_id(i),
+  SDMA0_4_0__SRCID__SDMA_SRBMWRITE,
+  &adev->sdma.srbm_write_irq);
+if (r)
+return r;
+}
+
 for (i = 0; i < adev->sdma.num_instances; i++) {
 ring = &adev->sdma.instance[i].ring;
 ring->ring_obj = NULL;
@@ -2198,6 +2225,72 @@ static int sdma_v4_0_set_ecc_irq_state(struct 
amdgpu_device *adev,
 return 0;
 }

+static int sdma_v4_0_print_iv_entry(struct amdgpu_device *adev,
+  struct amdgpu_iv_entry *entry) {
+int instance;
+struct amdgpu_task_info task_info;
+u64 addr;
+
+instance = sdma_v4_0_irq_id_to_seq(entry->client_id);
+if (instance < 0 || instance >= adev->sdma.num_instances) {
+dev_err(adev->dev, "sdma instance invalid %d\n", instance);
+return -EINVAL;
+}
+
+addr = (u64)entry->src_data[0] << 12;
+addr |= ((u64)entry->src_data[1] & 0xf) << 44;
+
+memset(&task_info, 0, sizeof(struct amdgpu_task_info));
+amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
+
+dev_info(adev->dev,
+   "[sdma%d] address:0x%016llx src_id:%u ring:%u vmid:%u "
+   "pasid:%u, for process %s pid %d thread %s pid %d\n",
+   instance, addr, entry->src_id, entry->ring_id, entry->vmid,
+   entry->pasid, task_info.process_name, task_info.tgid,
+   task_info.task_name, task_info.pid);
+return 0;
+}
+
+static int sdma_v4_0_process_vm_hole_irq(struct amdgpu_device *adev,
+  struct amdgpu_irq_src *source,
+  struct amdgpu_iv_entry *entry) {
+dev_err(adev->dev, "MC or SEM address in VM hole\n");
+sdma_v4_0_print_iv_entry(adev, entry);
+return 0;
+}
+
+static int sdma_v4_0_process_doorbell_invalid_irq(struct amdgpu_device *adev,
+  struct amdgpu_irq_src *source,
+  struct amdgpu_iv_entry *entry) {
+dev_err(adev->dev, "SDMA received a doorbell from BIF with byte_enable 
!=0xff\n");
+sdma_v4_0_print_iv_entry(adev, entry);
+return 0;
+}
+
+static int sdma_v4_0_process_pool_timeout_irq(struct amdgpu_device *adev,
+  struct amdgpu_irq_src *source,
+  struct amdgpu_iv_entry *entry) {
+dev_err(adev->dev,
+"Polling register/memory timeout executing POLL_REG/MEM with finite timer\n");
+sdma_v4_0_print_iv_entry(adev, entry);
+return 0;
+}
+
+static int sdma_v4_0_process_srbm_write_irq(struct amdgpu_device *adev,
+  struct amdgpu_irq_src *source,
+  struct amdgpu_iv_entry *entry) {
+dev_err(adev->dev,
+"SDMA gets an Register Write SRBM_WRITE command in non-privilege command 
buffer\n");
+sdma_v4_0_print_iv_entry(adev, entry);
+return 0;
+}
+
 static void sdma_v4_0_update_medium_grain_clock_gating(
 struct amdgpu_device *adev,
 bool enable)
@@ -2503,7 +2596,2
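
The address reconstruction in sdma_v4_0_print_iv_entry() in the patch above can be tried in isolation. The sketch below copies only the two shift-and-mask statements from the patch; the function name is hypothetical, and the reading that src_data[0] carries address bits 12..43 and the low nibble of src_data[1] carries bits 44..47 is inferred from the shifts, not from hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild a page-aligned 48-bit address from two 32-bit IV entry words. */
static uint64_t mock_iv_fault_addr(uint32_t src_data0, uint32_t src_data1)
{
	uint64_t addr;

	/* Widen before shifting so the upper bits are not lost. */
	addr = (uint64_t)src_data0 << 12;           /* bits 12..43 */
	addr |= ((uint64_t)src_data1 & 0xf) << 44;  /* bits 44..47 */
	return addr;
}
```

The cast to a 64-bit type before each shift matters: shifting the 32-bit word directly would drop every bit that moves past bit 31.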

RE: [PATCH] drm/amdgpu: soc15 pcie gen4 support

2021-03-04 Thread Xu, Feifei

Yes, it seems unnecessary in the current implementation.
 
Thanks,
Feifei

-Original Message-
From: Lazar, Lijo  
Sent: Thursday, March 4, 2021 3:38 PM
To: Chen, Guchun ; Alex Deucher ; 
Xu, Feifei 
Cc: amd-gfx list ; Zhang, Hawking 

Subject: RE: [PATCH] drm/amdgpu: soc15 pcie gen4 support

[AMD Public Use]

This function does nothing; is it necessary to keep it? I am not sure whether 
PCIe gen support needs to be enabled specifically in the driver.

Thanks,
Lijo

-Original Message-
From: amd-gfx  On Behalf Of Chen, Guchun
Sent: Thursday, March 4, 2021 1:06 PM
To: Alex Deucher ; Xu, Feifei 
Cc: amd-gfx list ; Zhang, Hawking 

Subject: RE: [PATCH] drm/amdgpu: soc15 pcie gen4 support

[AMD Public Use]

How about the amdgpu_pcie_gen2 module parameter check in soc15_pcie_gen4_enable? 
Is it necessary to modify it as well?

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Thursday, March 4, 2021 11:58 AM
To: Xu, Feifei 
Cc: amd-gfx list ; Zhang, Hawking 

Subject: Re: [PATCH] drm/amdgpu: soc15 pcie gen4 support

On Wed, Mar 3, 2021 at 10:46 PM Feifei Xu  wrote:
>
> Signed-off-by: Feifei Xu 

Acked-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/soc15.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 28b991904eaa..437cdc56bdc5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -569,7 +569,7 @@ static int soc15_set_vce_clocks(struct amdgpu_device 
> *adev, u32 evclk, u32 ecclk
> return 0;
>  }
>
> -static void soc15_pcie_gen3_enable(struct amdgpu_device *adev)
> +static void soc15_pcie_gen4_enable(struct amdgpu_device *adev)
>  {
> if (pci_is_root_bus(adev->pdev->bus))
> return;
> @@ -581,7 +581,8 @@ static void soc15_pcie_gen3_enable(struct amdgpu_device 
> *adev)
> return;
>
> if (!(adev->pm.pcie_gen_mask & (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2 |
> -   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3)))
> +   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3 |
> +   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4)))
> return;
>
> /* todo */
> @@ -1374,8 +1375,8 @@ static int soc15_common_hw_init(void *handle)  {
> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>
> -   /* enable pcie gen2/3 link */
> -   soc15_pcie_gen3_enable(adev);
> +   /* enable pcie gen2/3/4 link */
> +   soc15_pcie_gen4_enable(adev);
> /* enable aspm */
> soc15_program_aspm(adev);
> /* setup nbio registers */
> --
> 2.25.1
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amd/pm: remove duplicate XGMI feature mask

2021-03-04 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Series is Acked-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Kevin Wang
Sent: Thursday, March 4, 2021 3:35 PM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Kevin(Yang) 
Subject: [PATCH] drm/amd/pm: remove duplicate XGMI feature mask

Replace the SMU feature XGMI with XGMI_DPM.
The duplicate causes the pp_features node to show incorrect output.

Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/pm/inc/smu_types.h| 1 -
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu_types.h 
b/drivers/gpu/drm/amd/pm/inc/smu_types.h
index aa4822202587..f9f45b6764fa 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_types.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_types.h
@@ -282,7 +282,6 @@ enum smu_clk_type {
__SMU_DUMMY_MAP(DS_FCLK),   \
__SMU_DUMMY_MAP(DS_MP1CLK), \
__SMU_DUMMY_MAP(DS_MP0CLK), \
-   __SMU_DUMMY_MAP(XGMI),  \
__SMU_DUMMY_MAP(DPM_GFX_PACE),  \
__SMU_DUMMY_MAP(MEM_VDDCI_SCALING), \
__SMU_DUMMY_MAP(MEM_MVDD_SCALING),  \
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
index f76d1b8aeecc..8189457a3ae6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
@@ -162,7 +162,7 @@ static const struct cmn2asic_mapping 
arcturus_feature_mask_map[SMU_FEATURE_COUNT
 FEA_MAP(DPM_SOCCLK),
 FEA_MAP(DPM_FCLK),
 FEA_MAP(DPM_MP0CLK),
-ARCTURUS_FEA_MAP(SMU_FEATURE_XGMI_BIT, FEATURE_DPM_XGMI_BIT),
+FEA_MAP(DPM_XGMI),
 FEA_MAP(DS_GFXCLK),
 FEA_MAP(DS_SOCCLK),
 FEA_MAP(DS_LCLK),
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amd/pm: correct the watermark settings for Polaris

2021-03-04 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Friday, March 5, 2021 2:25 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Georgios Toptsidis 
; Quan, Evan ; Chen, Guchun 

Subject: [PATCH] drm/amd/pm: correct the watermark settings for Polaris

The "/ 10" should be applied to the right-hand operand instead of the left-hand 
one.

Change-Id: Ie730a1981aa5dee45cd6c3efccc7fb0f088cd679
Signed-off-by: Evan Quan 
Noticed-by: Georgios Toptsidis 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index c57dc9ae81f2..a2681fe875ed 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -5216,10 +5216,10 @@ static int smu7_set_watermarks_for_clocks_ranges(struct 
pp_hwmgr *hwmgr,
 for (j = 0; j < dep_sclk_table->count; j++) {
 valid_entry = false;
 for (k = 0; k < watermarks->num_wm_sets; k++) {
-if (dep_sclk_table->entries[i].clk / 10 >= 
watermarks->wm_clk_ranges[k].wm_min_eng_clk_in_khz &&
-dep_sclk_table->entries[i].clk / 10 < 
watermarks->wm_clk_ranges[k].wm_max_eng_clk_in_khz &&
-dep_mclk_table->entries[i].clk / 10 >= 
watermarks->wm_clk_ranges[k].wm_min_mem_clk_in_khz &&
-dep_mclk_table->entries[i].clk / 10 < 
watermarks->wm_clk_ranges[k].wm_max_mem_clk_in_khz) {
+if (dep_sclk_table->entries[i].clk >= 
watermarks->wm_clk_ranges[k].wm_min_eng_clk_in_khz / 10 &&
+dep_sclk_table->entries[i].clk < 
watermarks->wm_clk_ranges[k].wm_max_eng_clk_in_khz / 10 &&
+dep_mclk_table->entries[i].clk >= 
watermarks->wm_clk_ranges[k].wm_min_mem_clk_in_khz / 10 &&
+dep_mclk_table->entries[i].clk <
+watermarks->wm_clk_ranges[k].wm_max_mem_clk_in_khz / 10) {
 valid_entry = true;
 table->DisplayWatermark[i][j] = watermarks->wm_clk_ranges[k].wm_set_id;
 break;
--
2.29.0
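
The effect of moving the "/ 10" can be checked with a small standalone sketch. It assumes that the dependency table clocks are stored in 10 kHz units while the watermark ranges are in kHz, which is what the fix and the *_in_khz field names suggest but which this thread does not state outright. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed form: convert the kHz range bounds down to 10 kHz units. */
static bool mock_clk_in_range(uint32_t clk_10khz,
			      uint32_t min_khz, uint32_t max_khz)
{
	return clk_10khz >= min_khz / 10 && clk_10khz < max_khz / 10;
}

/* Pre-fix form, kept for comparison: it divides the wrong operand. */
static bool mock_clk_in_range_old(uint32_t clk_10khz,
				  uint32_t min_khz, uint32_t max_khz)
{
	return clk_10khz / 10 >= min_khz && clk_10khz / 10 < max_khz;
}
```

Under that assumption, a 300 MHz clock is stored as 30000. Against a 200000..400000 kHz range the fixed comparison matches, while the old one compares 3000 against 200000 and never does.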

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: remove ECO_BITS programing on gmc9

2021-03-08 Thread Xu, Feifei

[AMD Public Use]

Thanks Anna. The result is good on the SRIOV guest driver as well. Will push with:

Reviewed-by: Hawking Zhang 
Tested-by: Anna Jin <anna@amd.com>

Thanks,
Feifei

-Original Message-
From: Zhang, Hawking  
Sent: Friday, March 5, 2021 8:51 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Lin, Amber ; Xu, Feifei ; Jin, Anna 

Subject: RE: [PATCH] drm/amdgpu: remove ECO_BITS programing on gmc9

[AMD Public Use]

Reviewed-by: Hawking Zhang 

Per discussion, please work with Anna to identify the potential risk in SRIOV 
guest driver (VEGA10) before pushing the patch. Thanks.

Regards,
Hawking
-Original Message-
From: Feifei Xu  
Sent: Friday, March 5, 2021 17:10
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Lin, Amber ; Xu, 
Feifei 
Subject: [PATCH] drm/amdgpu: remove ECO_BITS programing on gmc9

Remove the ECO_BITS programing in driver on gfxhub1.0, mmhub1_x and mmhub_9.4.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 1 -
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  | 1 -
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c  | 1 -
 drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c  | 2 --
 4 files changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
index 0ab498d93e48..0cf993797df8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
@@ -186,7 +186,6 @@ static void gfxhub_v1_0_init_tlb_regs(struct amdgpu_device 
*adev)
ENABLE_ADVANCED_DRIVER_MODEL, 1);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL,
SYSTEM_APERTURE_UNMAPPED_ACCESS, 0);
-   tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ECO_BITS, 0);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL,
MTYPE, MTYPE_UC);/* XXX for emulation. */
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ATC_EN, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
index 0145d4d201cf..37b985317012 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
@@ -164,7 +164,6 @@ static void mmhub_v1_0_init_tlb_regs(struct amdgpu_device 
*adev)
ENABLE_ADVANCED_DRIVER_MODEL, 1);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL,
SYSTEM_APERTURE_UNMAPPED_ACCESS, 0);
-   tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ECO_BITS, 0);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL,
MTYPE, MTYPE_UC);/* XXX for emulation. */
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ATC_EN, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
index 816ff110a074..9099162553a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
@@ -189,7 +189,6 @@ static void mmhub_v1_7_init_tlb_regs(struct amdgpu_device 
*adev)
ENABLE_ADVANCED_DRIVER_MODEL, 1);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL,
SYSTEM_APERTURE_UNMAPPED_ACCESS, 0);
-   tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ECO_BITS, 0);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL,
MTYPE, MTYPE_UC);/* XXX for emulation. */
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ATC_EN, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
index 65fb88d391d3..d68f3cd2d40d 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
@@ -219,8 +219,6 @@ static void mmhub_v9_4_init_tlb_regs(struct amdgpu_device 
*adev, int hubid)
ENABLE_ADVANCED_DRIVER_MODEL, 1);
tmp = REG_SET_FIELD(tmp, VMSHAREDVC0_MC_VM_MX_L1_TLB_CNTL,
SYSTEM_APERTURE_UNMAPPED_ACCESS, 0);
-   tmp = REG_SET_FIELD(tmp, VMSHAREDVC0_MC_VM_MX_L1_TLB_CNTL,
-   ECO_BITS, 0);
tmp = REG_SET_FIELD(tmp, VMSHAREDVC0_MC_VM_MX_L1_TLB_CNTL,
MTYPE, MTYPE_UC);/* XXX for emulation. */
tmp = REG_SET_FIELD(tmp, VMSHAREDVC0_MC_VM_MX_L1_TLB_CNTL,
--
2.25.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: skip read eeprom for device that pending on XGMI reset

2021-03-09 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]



Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of shaoyunl
Sent: Wednesday, March 10, 2021 9:27 AM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun 
Subject: [PATCH] drm/amdgpu: skip read eeprom for device that pending on XGMI 
reset

Reading the EEPROM through the SMU is not stable during XGMI reset in testing.
Skip it for now.

Signed-off-by: shaoyunl 
Change-Id: Id864b96a9da5b0d4dd5ffef9858997dd9f52de25
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index c669435ccc74..a2ab8ee251f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1822,6 +1822,12 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
goto out;
}
 
+   /* Todo: During test the SMU might fail to read the eeprom through I2C
+* when the GPU is pending on XGMI reset during probe time
+* (Mostly after second bus reset), skip it now
+*/
+   if (adev->gmc.xgmi.pending_reset)
+   return 0;
ret = amdgpu_ras_eeprom_init(&con->eeprom_control, &exc_err_limit);
/*
 * This calling fails when exc_err_limit is true or
-- 
2.17.1


RE: [PATCH 2/2] drm/amdgpu: skip query VFCT table for headless ASICs

2021-03-09 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Hi Alex,

We have a thread that discussed the GOP driver with Xiong on headless SKUs. I 
just forwarded it to you as well. He has confirmed this on that thread.

There are some NV ASICs which have VCN harvested. Those parts report the VGA 
class even though they are headless.
My idea was to handle the non-VGA subclass devices and the VGA subclass but 
headless devices in one place, so I added the amdgpu_device_is_headless() 
function, which includes the NV check.

I can drop amdgpu_device_is_headless() and instead detect the headless case by 
checking both the VGA subclass and nv_is_headless().

Thanks,
Feifei
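
For reference, the alternative described above (testing the VGA subclass and
the NV headless condition together) could look roughly like the following in
plain C. This is only an illustrative user-space sketch: struct fake_dev and
should_query_vfct() are hypothetical names, nv_headless stands in for whatever
nv_is_headless() would return, and the class constants are the values from the
kernel's pci_ids.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the kernel's include/linux/pci_ids.h. */
#define PCI_CLASS_DISPLAY_VGA   0x0300
#define PCI_CLASS_DISPLAY_OTHER 0x0380

/*
 * Hypothetical stand-in for the device state discussed in this thread.
 * pci_class mirrors pdev->class (base class, subclass, prog-if);
 * nv_headless stands in for the result of nv_is_headless().
 */
struct fake_dev {
	uint32_t pci_class;
	bool nv_headless;
};

/* True when the device advertises the VGA subclass. */
static bool is_vga_class(const struct fake_dev *dev)
{
	return (dev->pci_class >> 8) == PCI_CLASS_DISPLAY_VGA;
}

/*
 * Sketch of the combined check: only consult the VFCT table when the
 * device is a VGA class device and is not one of the headless parts.
 */
static bool should_query_vfct(const struct fake_dev *dev)
{
	return is_vga_class(dev) && !dev->nv_headless;
}
```

With this shape, a device whose class is PCI_CLASS_DISPLAY_OTHER never reaches
the VFCT lookup, and a VGA class part flagged as headless is skipped as well.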

-Original Message-
From: Alex Deucher 
Sent: Wednesday, March 10, 2021 12:50 PM
To: Xu, Feifei 
Cc: amd-gfx list ; Zhang, Hawking 

Subject: Re: [PATCH 2/2] drm/amdgpu: skip query VFCT table for headless ASICs

On Tue, Mar 9, 2021 at 11:38 PM Feifei Xu  wrote:
>
> There will be no GOP driver to copy vbios image to VFCT table for
> headless ASICs. Thus skip VFCT.

I'm not sure these patches are entirely correct.

>
> Signed-off-by: Feifei Xu 
> Reviewed-by: Hawking Zhang 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> index f454a6bd0ed6..03739774beca 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> @@ -427,7 +427,7 @@ bool amdgpu_get_bios(struct amdgpu_device *adev)
> goto success;
> }
>
> -   if (amdgpu_acpi_vfct_bios(adev)) {
> +   if (!amdgpu_device_is_headless(adev) &&
> + amdgpu_acpi_vfct_bios(adev)) {

I would drop the first patch and just check the pci class directly here, it's 
more clear what it's checking for, plus I don't know if it's a good idea to mix 
the nv check in here.

Alex

> dev_info(adev->dev, "Fetched VBIOS from VFCT\n");
> goto success;
> }
> --
> 2.25.1
>

RE: [PATCH] drm/amdgpu: Use DRM_INFO if VFCT table not valid

2021-03-15 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Sorry, please ignore this one. I will draft a V2 that drops this hunk:
if (!adev->bios) {
-DRM_ERROR("Unable to allocate bios\n");
+DRM_INFO("Unable to allocate bios,skipping\n");
 return false;

Thanks,
Feifei

-Original Message-
From: Feifei Xu 
Sent: Monday, March 15, 2021 2:46 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Zhang, Hawking ; Xu, Feifei 

Subject: [PATCH] drm/amdgpu: Use DRM_INFO if VFCT table not valid

Some ASICs have no GOP driver to copy the vbios image to the VFCT table.
In this case, the driver will go on to the next check.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index f454a6bd0ed6..dde27b26a735 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -320,7 +320,7 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev)

 adev->bios = kmalloc(size, GFP_KERNEL);
 if (!adev->bios) {
-DRM_ERROR("Unable to allocate bios\n");
+DRM_INFO("Unable to allocate bios,skipping\n");
 return false;
 }

@@ -368,7 +368,7 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device 
*adev)
 return false;
 tbl_size = hdr->length;
 if (tbl_size < sizeof(UEFI_ACPI_VFCT)) {
-DRM_ERROR("ACPI VFCT table present but broken (too short #1)\n");
+DRM_INFO("ACPI VFCT table present but broken (too short #1),skipping\n");
 return false;
 }

@@ -381,13 +381,13 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device 
*adev)

 offset += sizeof(VFCT_IMAGE_HEADER);
 if (offset > tbl_size) {
-DRM_ERROR("ACPI VFCT image header truncated\n");
+DRM_INFO("ACPI VFCT image header truncated,skipping\n");
 return false;
 }

 offset += vhdr->ImageLength;
 if (offset > tbl_size) {
-DRM_ERROR("ACPI VFCT image truncated\n");
+DRM_INFO("ACPI VFCT image truncated,skipping\n");
 return false;
 }

@@ -410,7 +410,7 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device 
*adev)
 }
 }

-DRM_ERROR("ACPI VFCT table present but broken (too short #2)\n");
+DRM_INFO("ACPI VFCT table present but broken (too short #2),skipping\n");
 return false;
 }
 #else
--
2.25.1
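
The truncation checks visible as context in the hunks above follow a common
pattern: advance an offset past a header, verify that it still lies inside the
table, then do the same for the payload. A simplified, self-contained sketch of
that pattern, assuming a hypothetical one-field header (the real VFCT
structures carry more fields, and image_fits() is not a driver function):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified image header for illustration only. */
struct toy_image_header {
	uint32_t image_length;	/* payload bytes following the header */
};

/*
 * Returns true when the header at `offset` and its payload both lie
 * fully inside a table of `tbl_size` bytes.
 */
static bool image_fits(const uint8_t *tbl, size_t tbl_size, size_t offset)
{
	struct toy_image_header hdr;

	/* Header truncated? Phrased so that the arithmetic cannot wrap. */
	if (offset > tbl_size || tbl_size - offset < sizeof(hdr))
		return false;

	memcpy(&hdr, tbl + offset, sizeof(hdr));
	offset += sizeof(hdr);

	/* Payload truncated? */
	if (hdr.image_length > tbl_size - offset)
		return false;

	return true;
}
```

Whether a failed check is reported at error or info level is a separate
decision; the sketch only covers the bounds logic.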


RE: [PATCH] drm/amdgpu: Use DRM_INFO if VFCT table not valid

2021-03-15 Thread Xu, Feifei

OK. Will add in V2

Thanks,
Feifei

-Original Message-
From: Zhang, Hawking  
Sent: Monday, March 15, 2021 3:28 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 

Subject: RE: [PATCH] drm/amdgpu: Use DRM_INFO if VFCT table not valid

[AMD Public Use]

It might be better to switch to dev_err so that in an mGPU setup the BDF is 
printed along with the warning.

Regards,
Hawking

-Original Message-
From: Xu, Feifei 
Sent: Monday, March 15, 2021 15:11
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu: Use DRM_INFO if VFCT table not valid

[AMD Official Use Only - Internal Distribution Only]

Sorry, please ignore this one. I will draft a V2 that drops this hunk:
if (!adev->bios) {
-DRM_ERROR("Unable to allocate bios\n");
+DRM_INFO("Unable to allocate bios,skipping\n");
 return false;

Thanks,
Feifei

-Original Message-
From: Feifei Xu 
Sent: Monday, March 15, 2021 2:46 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Lazar, Lijo 
; Zhang, Hawking ; Xu, Feifei 

Subject: [PATCH] drm/amdgpu: Use DRM_INFO if VFCT table not valid

Some ASICs have no GOP driver to copy the vbios image to the VFCT table.
In this case, the driver will go on to the next check.

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index f454a6bd0ed6..dde27b26a735 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -320,7 +320,7 @@ static bool amdgpu_atrm_get_bios(struct amdgpu_device *adev)

 adev->bios = kmalloc(size, GFP_KERNEL);
 if (!adev->bios) {
-DRM_ERROR("Unable to allocate bios\n");
+DRM_INFO("Unable to allocate bios,skipping\n");
 return false;
 }

@@ -368,7 +368,7 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device 
*adev)
 return false;
 tbl_size = hdr->length;
 if (tbl_size < sizeof(UEFI_ACPI_VFCT)) {
-DRM_ERROR("ACPI VFCT table present but broken (too short #1)\n");
+DRM_INFO("ACPI VFCT table present but broken (too short #1),skipping\n");
 return false;
 }

@@ -381,13 +381,13 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device 
*adev)

 offset += sizeof(VFCT_IMAGE_HEADER);
 if (offset > tbl_size) {
-DRM_ERROR("ACPI VFCT image header truncated\n");
+DRM_INFO("ACPI VFCT image header truncated,skipping\n");
 return false;
 }

 offset += vhdr->ImageLength;
 if (offset > tbl_size) {
-DRM_ERROR("ACPI VFCT image truncated\n");
+DRM_INFO("ACPI VFCT image truncated,skipping\n");
 return false;
 }

@@ -410,7 +410,7 @@ static bool amdgpu_acpi_vfct_bios(struct amdgpu_device 
*adev)
 }
 }

-DRM_ERROR("ACPI VFCT table present but broken (too short #2)\n");
+DRM_INFO("ACPI VFCT table present but broken (too short #2),skipping\n");
 return false;
 }
 #else
--
2.25.1


RE: [PATCH] drm/amdgpu: temporarily read bounding box from gpu_info fw for navi12

2020-06-02 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: Tianci Yin 
Sent: Wednesday, June 3, 2020 10:08 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Xu, Feifei 
; Yuan, Xiaojie ; Li, Pauline 
; Yin, Tianci (Rico) 
Subject: [PATCH] drm/amdgpu: temporarily read bounding box from gpu_info fw for 
navi12

From: "Tianci.Yin" 

The bounding box is still needed by Navi12, so temporarily read it from the 
gpu_info firmware. This should be dropped when DAL no longer needs it.

Change-Id: Ifc330ec860f9b0665134a81df2fc80ca91c41a33
Reviewed-by: Alex Deucher 
Reviewed-by: Xiaojie Yuan 
Signed-off-by: Tianci.Yin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 15de344438d2..1df28b7bf22e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1537,7 +1537,14 @@ static int amdgpu_device_parse_gpu_info_fw(struct 
amdgpu_device *adev)

 if (adev->discovery_bin) {
 amdgpu_discovery_get_gfx_info(adev);
-return 0;
+
+/*
+ * FIXME: The bounding box is still needed by Navi12, so
+ * temporarily read it from gpu_info firmware. Should be droped
+ * when DAL no longer needs it.
+ */
+if (adev->asic_type != CHIP_NAVI12)
+return 0;
 }

 switch (adev->asic_type) {
@@ -1627,6 +1634,12 @@ static int amdgpu_device_parse_gpu_info_fw(struct 
amdgpu_device *adev)
 (const struct gpu_info_firmware_v1_0 *)(adev->firmware.gpu_info_fw->data +
 le32_to_cpu(hdr->header.ucode_array_offset_bytes));

+/*
+ * Should be droped when DAL no longer needs it.
+ */
+if (adev->asic_type == CHIP_NAVI12)
+goto parse_soc_bounding_box;
+
 adev->gfx.config.max_shader_engines = le32_to_cpu(gpu_info_fw->gc_num_se);
 adev->gfx.config.max_cu_per_sh = le32_to_cpu(gpu_info_fw->gc_num_cu_per_sh);
 adev->gfx.config.max_sh_per_se = le32_to_cpu(gpu_info_fw->gc_num_sh_per_se);
@@ -1655,6 +1668,7 @@ static int amdgpu_device_parse_gpu_info_fw(struct 
amdgpu_device *adev)
 le32_to_cpu(gpu_info_fw->num_packer_per_sc);
 }

+parse_soc_bounding_box:
 /*
  * soc bounding box info is not integrated in disocovery table,
  * we always need to parse it from gpu info firmware if needed.
--
2.17.1


RE: [PATCH] drm/amdgpu: only register VGA devices with the VGA arbiter

2020-11-20 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Friday, November 20, 2020 10:55 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH] drm/amdgpu: only register VGA devices with the VGA arbiter

We only need to arbitrate VGA access on VGA compatible devices.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2f60b7084f4d..2670fb113ba1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3346,7 +3346,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 /* if we have > 1 VGA cards, then disable the amdgpu VGA resources */
 /* this will fail for cards that aren't VGA class devices, just
  * ignore it */
-vga_client_register(adev->pdev, adev, NULL, amdgpu_device_vga_set_decode);
+if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
+vga_client_register(adev->pdev, adev, NULL, amdgpu_device_vga_set_decode);

 if (amdgpu_device_supports_boco(ddev))
 boco = true;
@@ -3605,7 +3606,8 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 vga_switcheroo_unregister_client(adev->pdev);
 if (amdgpu_device_supports_boco(adev_to_drm(adev)))
 vga_switcheroo_fini_domain_pm_ops(adev->dev);
-vga_client_register(adev->pdev, NULL, NULL, NULL);
+if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
+vga_client_register(adev->pdev, NULL, NULL, NULL);
 if (adev->rio_mem)
 pci_iounmap(adev->pdev, adev->rio_mem);
 adev->rio_mem = NULL;
--
2.25.4


RE: [PATCH 3/3] drm/amdgpu: de-initialize software ih ring

2020-12-20 Thread Xu, Feifei

[AMD Official Use Only - Internal Distribution Only]



Series is
Reviewed-by: Feifei Xu 

-Original Message-
From: Zhang, Hawking  
Sent: December 21, 2020 1:43 PM
To: amd-gfx@lists.freedesktop.org; Xu, Feifei ; Koenig, 
Christian 
Cc: Zhang, Hawking 
Subject: [PATCH 3/3] drm/amdgpu: de-initialize software ih ring

tear down software ih ring and its state.

Signed-off-by: Hawking Zhang 
---
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 1 +
 drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c 
b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index 04cc41b82661..060357625504 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -584,6 +584,7 @@ static int navi10_ih_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
amdgpu_irq_fini(adev);
+   amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
amdgpu_ih_ring_fini(adev, &adev->irq.ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c 
b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index 1581113477cf..88626d83e07b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -524,6 +524,7 @@ static int vega10_ih_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
amdgpu_irq_fini(adev);
+   amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
amdgpu_ih_ring_fini(adev, &adev->irq.ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c 
b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
index a1d4d67d5ee1..5487262527aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
@@ -571,6 +571,7 @@ static int vega20_ih_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
amdgpu_irq_fini(adev);
+   amdgpu_ih_ring_fini(adev, &adev->irq.ih_soft);
amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
amdgpu_ih_ring_fini(adev, &adev->irq.ih);
--
2.17.1

RE: [PATCH v2 4/4] drm/amd: Enable seamless boot by default on newer ASICs

2023-09-25 Thread Xu, Feifei

[AMD Official Use Only - General]

Hi Mario,

Navi32, which has DCE 3.2.0, does not support this. This patch will cause 
modprobe to fail on NV32.

[  +0.000126] [drm] DSC precompute is not needed.
[ +19.026503] amdgpu :03:00.0: amdgpu: SMU: I'm not done with your previous 
command: SMN_C2PMSG_66:0x002D SMN_C2PMSG_82:0x
[  +0.02] amdgpu :03:00.0: amdgpu: Failed to power gate JPEG!
[  +0.01] [drm:amdgpu_dpm_enable_jpeg [amdgpu]] *ERROR* Dpm disable jpeg 
failed, ret = -62.
[Sep26 11:00] amdgpu :03:00.0: amdgpu: SMU: I'm not done with your previous 
command: SMN_C2PMSG_66:0x002D SMN_C2PMSG_82:0x
[  +0.01] amdgpu :03:00.0: amdgpu: Failed to power gate VCN!
[  +0.00] [drm:amdgpu_dpm_enable_uvd [amdgpu]] *ERROR* Dpm disable uvd 
failed, ret = -62.
[  +3.557018] amdgpu :03:00.0: amdgpu: SMU: I'm not done with your previous 
command: SMN_C2PMSG_66:0x002D SMN_C2PMSG_82:0x
[ +16.269817] watchdog: BUG: soft lockup - CPU#7 stuck for 26s! [modprobe:28704]


Either change to:
return adev->ip_versions[DCE_HWIP][0] > IP_VERSION(3, 2, 0);

Or revert [PATCH v2 4/4] drm/amd: Enable seamless boot by default on newer 
ASICs. Either is OK.

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Mario 
Limonciello
Sent: Thursday, September 14, 2023 1:15 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Wentland, Harry 
; Limonciello, Mario 
Subject: [PATCH v2 4/4] drm/amd: Enable seamless boot by default on newer ASICs

Seamless boot can technically be supported as far back as DCN1 but to avoid 
regressions on older hardware, enable it for DCN3 and later.

If users report using the module parameter that it works on older ASICs as 
well, this can be adjusted.

Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2116e016178a..38fafed31a1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1361,9 +1361,9 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
 /*
  * Check whether seamless boot is supported.
  *
- * So far we only support seamless boot on select ASICs.
- * If everything goes well, we may consider expanding
- * seamless boot to other ASICs.
+ * So far we only support seamless boot on DCE 3.0 or later.
+ * If users report that it works on older ASICS as well, we may
+ * loosen this.
  */
 bool amdgpu_device_seamless_boot_supported(struct amdgpu_device *adev)
 {
@@ -1383,14 +1383,7 @@ bool amdgpu_device_seamless_boot_supported(struct 
amdgpu_device *adev)
if (adev->mman.keep_stolen_vga_memory)
return false;

-   switch (adev->ip_versions[DCE_HWIP][0]) {
-   case IP_VERSION(3, 0, 1):
-   return true;
-   default:
-   break;
-   }
-
-   return false;
+   return adev->ip_versions[DCE_HWIP][0] > IP_VERSION(3, 0, 0);
 }

 /*
--
2.34.1
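
As a side note on the version comparison in the patch above: assuming
IP_VERSION() packs major, minor and revision into a single integer (the macro
below is written out as an assumption for this sketch), a greater-than test
against 3.0.0 also matches 3.2.0, which is the case reported at the top of this
message. The helper names are hypothetical and only mirror the two variants of
the check shown in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed packing of major/minor/revision into one integer, so that
 * versions compare in their natural order.
 */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* The check after the patch: anything newer than 3.0.0 is accepted. */
static bool seamless_boot_patch_check(uint32_t dce_ver)
{
	return dce_ver > IP_VERSION(3, 0, 0);
}

/* The check before the patch: only 3.0.1 is accepted. */
static bool seamless_boot_old_check(uint32_t dce_ver)
{
	return dce_ver == IP_VERSION(3, 0, 1);
}
```

Under these assumptions, seamless_boot_patch_check(IP_VERSION(3, 2, 0)) is true
while seamless_boot_old_check(IP_VERSION(3, 2, 0)) is false, which matches the
behaviour change described in the report.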

RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-08 Thread Xu, Feifei

[AMD Official Use Only - General]

Hi,

>> Based on your description, the above code should use "||" instead of "&&",
The && is there to add a further restriction, so that a necessary TLB flush is 
not skipped by the early return.
For ASICs < GFX11, !adev->gfx.is_poweron is always true (this parameter is 
introduced from GFX11 on), so the check depends only on reset_domain->sem.
For ASICs = GFX11, !adev->gfx.is_poweron might be false (gfx might already be 
powered on during the reset), which makes the if () condition false; the return 
is not executed, and the TLB is flushed.

>> Also, merging the code into one line may result in the lock not being 
>> released if the lock is acquired successfully.
If !adev->gfx.is_poweron is true, reset_domain->sem will not be taken by 
down_read_trylock, which avoids this deadlock.

Thanks,
Feifei

-Original Message-
From: Wang, Yang(Kevin) 
Sent: Sunday, October 8, 2023 9:36 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xu, Feifei ; Koenig, 
Christian ; Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb


-Original Message-
From: amd-gfx  On Behalf Of Feifei Xu
Sent: Sunday, October 8, 2023 6:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xu, Feifei ; Koenig, 
Christian ; Zhang, Hawking 
Subject: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

To fix the GPU recovery failure on GFX11 with a gfxhub page fault, flush the 
GPU TLB after reset on GFX11.
The gfxhub TLB flush needs to check whether adev->gfx.is_poweron is set.

Fixes: d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb v2")

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 2f9bb86edd71..048d32edee88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -611,8 +611,9 @@ void amdgpu_gmc_flush_gpu_tlb(struct amdgpu_device *adev, 
uint32_t vmid,
/*
 * A GPU reset should flush all TLBs anyway, so no need to do
 * this while one is ongoing.
+* For GFX11, gfxhub flush check if adev->gfx.is_poweron is set.
 */
-   if (!down_read_trylock(&adev->reset_domain->sem))
+   if (!down_read_trylock(&adev->reset_domain->sem) &&
+!adev->gfx.is_poweron)
return;

[Kevin]:
Based on your description, the above code should use "||" instead of "&&".
Also, merging the code into one line may result in the lock not being released 
if the lock is acquired successfully.

Best Regards,
Kevin

if (adev->gmc.flush_tlb_needs_extra_type_2)
--
2.34.1

RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-09 Thread Xu, Feifei

[AMD Official Use Only - General]

Yes, the adev->gfx.is_poweron check is applied in the gmc_v11 callback, which 
is called after the generic gmc part, the amdgpu_gmc_flush_gpu_tlb() function.
But in commit d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb 
v2"), the flush is moved to the higher-level amdgpu_gmc_flush_gpu_tlb() 
function.

Thus the gmc_v11 callback is never called during resume: 
adev->reset_domain->sem has not been released, so the function returns early. 
Adding a check of adev->gfx.is_poweron keeps GFX11 from returning early, as 
follows:

if (!down_read_trylock(&adev->reset_domain->sem) && //--> true in gfx11
+!adev->gfx.is_poweron) //--> false in gfx11, so the whole if condition 
is false and the function does not return early. The gmc v11 callback that 
follows is then executed.

Thanks,
Feifei

-Original Message-
From: Zhang, Hawking 
Sent: Monday, October 9, 2023 4:58 PM
To: Xu, Feifei ; Wang, Yang(Kevin) ; 
amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian 
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

[AMD Official Use Only - General]

The adev->gfx.is_poweron check should already be applied in the IP-specific 
(gmc v11) callback. If gfx is not powered on, it does nothing and just returns. 
I don't see how it helps resolve the issue if we just move the check from one 
function to another.

Regards,
Hawking

-Original Message-
From: Xu, Feifei 
Sent: Monday, October 9, 2023 09:51
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Zhang, Hawking 

Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

[AMD Official Use Only - General]

Hi,

>> Based on your description, the above code should use "||" instead of
>> "&&",
The && is there to add a further restriction, so that a necessary TLB flush is 
not skipped by the early return.
For ASICs < GFX11, !adev->gfx.is_poweron is always true (this parameter is 
introduced from GFX11 on), so the check depends only on reset_domain->sem.
For ASICs = GFX11, !adev->gfx.is_poweron might be false (gfx might already be 
powered on during the reset), which makes the if () condition false; the return 
is not executed, and the TLB is flushed.

>> Also, merging the code into one line may result in the lock not being 
>> released if the lock is acquired successfully.
If !adev->gfx.is_poweron is true, reset_domain->sem will not be taken by 
down_read_trylock, which avoids this deadlock.

Thanks,
Feifei

-Original Message-
From: Wang, Yang(Kevin) 
Sent: Sunday, October 8, 2023 9:36 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xu, Feifei ; Koenig, 
Christian ; Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb


-Original Message-
From: amd-gfx  On Behalf Of Feifei Xu
Sent: Sunday, October 8, 2023 6:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xu, Feifei ; Koenig, 
Christian ; Zhang, Hawking 
Subject: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

To fix the GPU recovery failure on GFX11 with a gfxhub page fault, flush the 
GPU TLB after reset on GFX11.
The gfxhub TLB flush needs to check whether adev->gfx.is_poweron is set.

Fixes: d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb v2")

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 2f9bb86edd71..048d32edee88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -611,8 +611,9 @@ void amdgpu_gmc_flush_gpu_tlb(struct amdgpu_device *adev, 
uint32_t vmid,
/*
 * A GPU reset should flush all TLBs anyway, so no need to do
 * this while one is ongoing.
+* For GFX11, gfxhub flush check if adev->gfx.is_poweron is set.
 */
-   if (!down_read_trylock(&adev->reset_domain->sem))
+   if (!down_read_trylock(&adev->reset_domain->sem) &&
+!adev->gfx.is_poweron)
return;

[Kevin]:
Based on your description, the above code should use "||" instead of "&&".
Also, merging the code into one line may result in the lock not being released 
if the lock is acquired successfully.

Best Regards,
Kevin

if (adev->gmc.flush_tlb_needs_extra_type_2)
--
2.34.1

RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-10 Thread Xu, Feifei

[AMD Official Use Only - General]

If gfx is not powered on, both checks return early, so the logic does not 
change.
If gfx is powered on early in resume, the TLB flush in the IP-specific (gmc 
v11) callback is never called, because the higher-level check in 
amdgpu_gmc_flush_gpu_tlb() returns early. With the patch:

if (!down_read_trylock(&adev->reset_domain->sem) && //--> true in gfx11
+!adev->gfx.is_poweron) //--> (!adev->gfx.is_poweron) = false in gfx11, 
so the whole if condition is false and the function does not return early.

Thanks,
Feifei

-Original Message-
From: Xu, Feifei
Sent: Tuesday, October 10, 2023 10:28 AM
To: Zhang, Hawking ; Wang, Yang(Kevin) 
; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian 
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

Yes, the adev->gfx.is_poweron check is applied in the gmc_v11 callback, which 
is called after the generic gmc part, the amdgpu_gmc_flush_gpu_tlb() function.
But in commit d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb 
v2"), the flush is moved to the higher-level amdgpu_gmc_flush_gpu_tlb() 
function.

Thus the gmc_v11 callback is never called during resume: 
adev->reset_domain->sem has not been released, so the function returns early. 
Adding a check of adev->gfx.is_poweron keeps GFX11 from returning early, as 
follows:

if (!down_read_trylock(&adev->reset_domain->sem) && //--> true in gfx11
+!adev->gfx.is_poweron) //--> false in gfx11, so the whole if condition 
is false and the function does not return early. The gmc v11 callback that 
follows is then executed.

Thanks,
Feifei

-Original Message-
From: Zhang, Hawking 
Sent: Monday, October 9, 2023 4:58 PM
To: Xu, Feifei ; Wang, Yang(Kevin) ; 
amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian 
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

[AMD Official Use Only - General]

The adev->gfx.is_poweron check should already be applied in the IP-specific 
(gmc v11) callback. If gfx is not powered on, it does nothing and just returns. 
I don't see how it helps resolve the issue if we just move the check from one 
function to another.

Regards,
Hawking

-Original Message-
From: Xu, Feifei 
Sent: Monday, October 9, 2023 09:51
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Zhang, Hawking 

Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

[AMD Official Use Only - General]

Hi,

>> Based on your description, the above code should use "||" instead of
>> "&&",
The && is there to add a further restriction, so that a necessary TLB flush is 
not skipped by the early return.
For ASICs < GFX11, !adev->gfx.is_poweron is always true (this parameter is 
introduced from GFX11 on), so the check depends only on reset_domain->sem.
For ASICs = GFX11, !adev->gfx.is_poweron might be false (gfx might already be 
powered on during the reset), which makes the if () condition false; the return 
is not executed, and the TLB is flushed.

>> Also, merging the code into one line may result in the lock not being 
>> released if the lock is acquired successfully.
If !adev->gfx.is_poweron is true, reset_domain->sem will not be taken by 
down_read_trylock, which avoids this deadlock.

Thanks,
Feifei

-Original Message-
From: Wang, Yang(Kevin) 
Sent: Sunday, October 8, 2023 9:36 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xu, Feifei ; Koenig, 
Christian ; Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb


-Original Message-
From: amd-gfx  On Behalf Of Feifei Xu
Sent: Sunday, October 8, 2023 6:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xu, Feifei ; Koenig, 
Christian ; Zhang, Hawking 
Subject: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

To fix the GPU recovery failure on GFX11 with a gfxhub page fault, flush the 
GPU TLB after reset on GFX11.
The gfxhub TLB flush needs to check whether adev->gfx.is_poweron is set.

Fixes: d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb v2")

Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 2f9bb86edd71..048d32edee88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -611,8 +611,9 @@ void amdgpu_gmc_flush_gpu_tlb(struct amdgpu_device *adev, 
uint32_t vmid,
/*
 * A GPU reset should flush all TLBs anyway, so no need to do
 * this while one is ongoing.
+* For GFX11, gfxhub flush check if adev->gfx.is_poweron is set.
 */
-   if (!down_read_trylock(&adev->reset_domain->sem))
+   if (!down_read_trylock(&adev->reset_domain->sem

RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-10 Thread Xu, Feifei

[AMD Official Use Only - General]

>> Then a TLB flush shouldn't be necessary on reset. A reset implies that the 
>> TLB is cleared as well.
Hmm, in the current implementation, saying that a reset implies the TLB is 
cleared assumes that the TLB clear is purely a hardware action. There is no GPU 
TLB flush initiated by software/driver after suspend.

However, on some GFX11 ASICs (like nv31), the GPU TLB intentionally needs to be 
flushed by software/driver after the SMU has resumed successfully.
Without the GPU TLB flush on nv31, S3 or reset will break with a gfx page 
fault.

>> First of all the patch is broken because you only handle the locking, but 
>> not the unlocking part.
For the unlocking part, I realized that you and Kevin are correct. I should put 
the trylock check on the right of && to make sure it will be checked 
first. Otherwise lock/unlock are not paired.

Thanks,
Feifei
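
One way to keep lock and unlock paired while still letting the flush proceed
when gfx is already powered on is to remember whether the trylock succeeded.
The following is a user-space sketch under that assumption, not necessarily how
the driver resolves it: toy_rwsem and its helpers are hypothetical stand-ins
for adev->reset_domain->sem, down_read_trylock() and up_read(), and
flush_tlb_sketch() is an invented name.

```c
#include <stdbool.h>

/* Toy reader/writer semaphore standing in for adev->reset_domain->sem. */
struct toy_rwsem {
	int readers;
	bool writer;	/* true while a "reset" holds the semaphore */
};

/* Like down_read_trylock(): returns 1 on success, 0 on contention. */
static int toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return 0;
	sem->readers++;
	return 1;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	sem->readers--;
}

struct toy_dev {
	struct toy_rwsem reset_sem;
	bool gfx_is_poweron;
	int flushes;	/* counts how often the flush body ran */
};

static void flush_tlb_sketch(struct toy_dev *dev)
{
	/* Record the outcome so the release below matches the acquire. */
	bool locked = toy_down_read_trylock(&dev->reset_sem);

	/* Reset in progress and gfx powered off: nothing to flush. */
	if (!locked && !dev->gfx_is_poweron)
		return;

	dev->flushes++;	/* the actual flush would happen here */

	/* Only release what was actually taken. */
	if (locked)
		toy_up_read(&dev->reset_sem);
}
```

The point of the sketch is only the pairing: the reader count returns to zero
on every path, whether the flush ran with the lock, ran without it, or was
skipped.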

-Original Message-
From: Christian König 
Sent: Monday, October 9, 2023 4:52 PM
To: Xu, Feifei ; Wang, Yang(Kevin) ; 
amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Zhang, Hawking 

Subject: Re: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

Am 09.10.23 um 03:50 schrieb Xu, Feifei:
> [AMD Official Use Only - General]
>
> Hi,
>
>>> Based on your description, the above code should use "||" instead of
>>> "&&",
> The && is there to add a further restriction, so that a necessary TLB flush 
> is not skipped by the early return.
> For ASICs < GFX11, !adev->gfx.is_poweron is always true (this
> parameter is introduced from GFX11 on), so the check depends only on 
> reset_domain->sem. For ASICs = GFX11, !adev->gfx.is_poweron might be false 
> (gfx might already be powered on during the reset), which makes the if () 
> condition false; the return is not executed, and the TLB is flushed.

First of all the patch is broken because you only handle the locking, but not 
the unlocking part.

Then a TLB flush shouldn't be necessary on reset. A reset implies that the TLB 
is cleared as well.

We discussed the possibility to avoid that, but this is not supposed to be 
happening at the moment.

Regards,
Christian.

>
>>> And after merging code into one line may result in the lock not being 
>>> released if the lock can be acquired success.
> If !adev->gfx.is_poweron is true, the reset_domain->sem will not be 
> down_read_trylock, thus could avoid this deadlock.

>
> Thanks,
> Feifei
>
> -Original Message-
> From: Wang, Yang(Kevin) 
> Sent: Sunday, October 8, 2023 9:36 PM
> To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
> Cc: Xu, Feifei ; Xu, Feifei ;
> Koenig, Christian ; Zhang, Hawking
> 
> Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip
> flush_gpu_tlb
>
>
> -Original Message-
> From: amd-gfx  On Behalf Of
> Feifei Xu
> Sent: Sunday, October 8, 2023 6:07 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Xu, Feifei ; Xu, Feifei ;
> Koenig, Christian ; Zhang, Hawking
> 
> Subject: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb
>
> To fix the gpu recovery failed on GFX11 with gfxhub pagefault, flush gpu tlb 
> after reset on GFX11.
> Gfxhub tlb flush need check if adev->gfx.is_poweron set.
>
> Fixes: d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb
> v2")
>
> Signed-off-by: Feifei Xu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 2f9bb86edd71..048d32edee88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -611,8 +611,9 @@ void amdgpu_gmc_flush_gpu_tlb(struct amdgpu_device *adev, 
> uint32_t vmid,
>  /*
>   * A GPU reset should flush all TLBs anyway, so no need to do
>   * this while one is ongoing.
> +* For GFX11, gfxhub flush check if adev->gfx.is_poweron is 
> set.
>   */
> -   if (!down_read_trylock(&adev->reset_domain->sem))
> +   if (!down_read_trylock(&adev->reset_domain->sem) &&
> +!adev->gfx.is_poweron)
>  return;
>
> [Kevin]:
> Based on your description, the above code should use "||" instead of
> "&&", And after merging code into one line may result in the lock not being 
> released if the lock can be acquired success.
>
> Best Regards,
> Kevin
>
>  if (adev->gmc.flush_tlb_needs_extra_type_2)
> --
> 2.34.1
>
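
Editorial note on the "&&" versus "||" question raised above: the three variants of the early-return check can be compared as plain boolean predicates. This is only a sketch; the function names are hypothetical, `lock_ok` stands in for down_read_trylock() succeeding, and `powered_on` stands in for adev->gfx.is_poweron.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not driver code:
 *   lock_ok    models down_read_trylock() returning nonzero
 *   powered_on models adev->gfx.is_poweron
 * Each function answers: does the early "return" (skip the flush) trigger?
 */

/* Check before the patch: skip whenever the trylock fails. */
bool skips_flush_original(bool lock_ok, bool powered_on)
{
	(void)powered_on;
	return !lock_ok;
}

/* Check proposed in the patch: "&&" narrows the early return. */
bool skips_flush_and(bool lock_ok, bool powered_on)
{
	return !lock_ok && !powered_on;
}

/* The "||" reading: either condition alone skips the flush. */
bool skips_flush_or(bool lock_ok, bool powered_on)
{
	return !lock_ok || !powered_on;
}
```

With a failed trylock and gfx powered on, only the "&&" variant falls through to the flush, and it does so without holding the semaphore, which is the case the reviewers object to.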

Recall: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-10 Thread Xu, Feifei

Xu, Feifei would like to recall the message, "[PATCH] drm/amdgpu:Check gfx 
poweron when skip flush_gpu_tlb".

RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-10 Thread Xu, Feifei

[AMD Official Use Only - General]

>> Then a TLB flush shouldn't be necessary on reset. A reset implies that the 
>> TLB is cleared as well.
Hmm, in the current implementation, when we say a reset implies that the TLB is 
cleared, we assume that the TLB clear is purely a hardware action. There's no GPU 
TLB flush initiated by software/driver after suspend.

While on some gfx11 ASICs (like nv31), the GPU TLB intentionally needs to be 
flushed by software/driver after the SMU resumes successfully.
Without the GPU TLB flush on nv31, S3 or reset will break with a gfx page 
fault.

>> First of all the patch is broken because you only handle the locking, but 
>> not the unlocking part.
For the unlocking part, I realized that you and Kevin are correct. Lock/unlock 
are not paired.

Thanks,
Feifei


RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-10 Thread Xu, Feifei

[AMD Official Use Only - General]

For the unlocking, I have tested on both nv21 and nv31, and the lock/unlock pairing 
does not look broken.

On ASICs < GFX11, (!adev->gfx.is_poweron) is always true; this parameter was 
introduced with GFX11.
On gfx11, in the reset (suspend then resume) process, gfx powers on 
right after the SMU has resumed successfully.
So (!adev->gfx.is_poweron) is always false when trylocking the reset_domain->sem. 
gfx11 does not return early and continues with the GPU TLB flush in the IP specific 
(gmc v11) callback.

Then unlock after gpu tlb flush:

void amdgpu_gmc_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
			      uint32_t vmhub, uint32_t flush_type)
{
	if (!down_read_trylock(&adev->reset_domain->sem) &&
	    !adev->gfx.is_poweron)  //lock
		return;
	...

	adev->gmc.gmc_funcs->flush_gpu_tlb(adev, vmid, vmhub,
					   flush_type);
	up_read(&adev->reset_domain->sem);   //unlock
	return;
}

Thanks,
Feifei
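
Editorial note on the pairing concern: the control flow quoted above can be modelled in user space with a toy reader count. This is a sketch under stated assumptions, not kernel code; `toy_sem`, `toy_down_read_trylock`, `toy_up_read` and `toy_flush` are hypothetical stand-ins for the reset_domain semaphore and the flush function.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a read/write semaphore (hypothetical, user space only). */
struct toy_sem {
	int readers;        /* readers currently holding the semaphore */
	bool writer_active; /* true while a "reset" holds it for writing */
};

/* Succeeds only when no writer holds the semaphore. */
static bool toy_down_read_trylock(struct toy_sem *s)
{
	if (s->writer_active)
		return false;
	s->readers++;
	return true;
}

static void toy_up_read(struct toy_sem *s)
{
	s->readers--;
}

/* Same shape as the patched check; returns the reader count afterwards. */
int toy_flush(struct toy_sem *s, bool powered_on)
{
	if (!toy_down_read_trylock(s) && !powered_on)
		return s->readers; /* early return, nothing was taken */

	/* ... the flush itself would happen here ... */

	toy_up_read(s); /* runs even when the trylock above failed */
	return s->readers;
}
```

In this model a balanced call leaves the count at 0. When a writer is active and `powered_on` is true, the trylock fails, the early return is skipped, and the count ends at -1, i.e. an unlock without a matching lock.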

-Original Message-
From: Xu, Feifei
Sent: Tuesday, October 10, 2023 5:44 PM
To: Christian König ; Wang, Yang(Kevin) 
; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Zhang, Hawking 

Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

>> Then a TLB flush shouldn't be necessary on reset. A reset implies that the 
>> TLB is cleared as well.
Hmm, in the current implementation, when we say a reset implies that the TLB is 
cleared, we assume that the TLB clear is purely a hardware action. There's no GPU 
TLB flush initiated by software/driver after suspend.

While on some gfx11 ASICs (like nv31), the GPU TLB intentionally needs to be 
flushed by software/driver after the SMU resumes successfully.
Without the GPU TLB flush on nv31, S3 or reset will break with a gfx page 
fault.

>> First of all the patch is broken because you only handle the locking, but 
>> not the unlocking part.
For the unlocking part, I realized that you and Kevin are correct. Lock/unlock 
are not paired.

Thanks,
Feifei


RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-10 Thread Xu, Feifei

[AMD Official Use Only - General]

If that behavior is correct, this patch looks like a workaround for the HW reset 
not flushing the TLB, or for something that can be worked around by adding a GPU 
TLB flush.

Thanks,
Feifei

-Original Message-
From: Koenig, Christian 
Sent: Tuesday, October 10, 2023 5:07 PM
To: Xu, Feifei ; Zhang, Hawking ; 
Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

Hi Feifei,

yeah, that is correct behavior. The GMC callback should *not* get called during 
resume in a reset, because the reset needs to take care of invalidating the TLB 
anyway.

If the latter doesn't work any more we need to re-iterate the reset procedure 
and not mess with this here.

Regards,
Christian.

On 10.10.23 at 04:27, Xu, Feifei wrote:
> [AMD Official Use Only - General]
>
> Yes, adev->gfx.is_poweron check will be applied in gmc_v11 callback, which 
> will be called after the generic gmc part: amdgpu_gmc_flush_gpu_tlb() 
> function.
> But in commit: d0c860f33553 ("drm/amdgpu: rework lock handling for flush_tlb 
> v2"), the flush is moved at a higher level amdgpu_gmc_flush_gpu_tlb() 
> function.
>
> Thus the gmc_v11 callback will never be called in the resume because 
> adev->reset_domain->sem not released and returned ahead. Adding a check of 
> adev->gfx.is_poweron will let GFX11 not breaking ahead, like following:
>
> if (!down_read_trylock(&adev->reset_domain->sem) && //--> true in
> gfx11
> +!adev->gfx.is_poweron) //-->false in gfx11, and the whole if 
> statement will be false, not return ahead. The following gmc v11 callback 
> will be executed later.
>
> Thanks,
> Feifei
>
> -Original Message-
> From: Zhang, Hawking 
> Sent: Monday, October 9, 2023 4:58 PM
> To: Xu, Feifei ; Wang, Yang(Kevin)
> ; amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian 
> Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip
> flush_gpu_tlb
>
> [AMD Official Use Only - General]
>
> adev->gfx.is_poweron check should already be applied in IP specific (gmc v11) 
> callback. If gfx is not power on, it does nothing but just returns. I didn't 
> see how it helps resolve the issue if we just move the check from one 
> function to another.
>
> Regards,
> Hawking
>
> -Original Message-
> From: Xu, Feifei 
> Sent: Monday, October 9, 2023 09:51
> To: Wang, Yang(Kevin) ;
> amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian ; Zhang, Hawking
> 
> Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip
> flush_gpu_tlb
>
> [AMD Official Use Only - General]
>
> Hi,
>
>>> Based on your description, the above code should use "||" instead of
>>> "&&",
> && is to add more restriction here, to avoid skipping a necessary TLB flush by 
> returning.
> For ASICs < GFX11, !adev->gfx.is_poweron is always true (this parameter was 
> introduced with GFX11), so it only depends on reset_domain->sem. For ASICs = GFX11, 
> !adev->gfx.is_poweron might be false (gfx might already be powered on in the 
> reset); this makes the if () not true, so the return is not executed and the 
> TLB is flushed.
>
>>> And after merging code into one line may result in the lock not being 
>>> released if the lock can be acquired success.
> If !adev->gfx.is_poweron is true, the reset_domain->sem will not be 
> down_read_trylock, thus could avoid this deadlock.
>
> Thanks,
> Feifei
>

RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

2023-10-11 Thread Xu, Feifei

[AMD Official Use Only - General]

Hi,

Can I get an RB for this patch? It fixes the reset failure with the following 
call trace:

[   72.743001] amdgpu :03:00.0: amdgpu: [gfxhub] page fault (src_id:0 
ring:160 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[   72.743009] [drm] ring compute_32769.2.2 was added
[   72.743024] amdgpu :03:00.0: amdgpu:   in page starting at address 
0x004a2000 from client 10
[   72.743038] amdgpu :03:00.0: amdgpu: 
GCVM_L2_PROTECTION_FAULT_STATUS:0x00040B40
[   72.743050] amdgpu :03:00.0: amdgpu:  Faulty UTCL2 client ID: CPC 
(0x5)
[   72.743056] [drm] ring sdma_32769.3.3 was added
[   72.743061] amdgpu :03:00.0: amdgpu:  MORE_FAULTS: 0x0
[   72.743069] amdgpu :03:00.0: amdgpu:  WALKER_ERROR: 0x0
[   72.743077] amdgpu :03:00.0: amdgpu:  PERMISSION_FAULTS: 0x4
[   72.743086] amdgpu :03:00.0: amdgpu:  MAPPING_ERROR: 0x1
[   72.743095] amdgpu :03:00.0: amdgpu:  RW: 0x1
[   72.743105] amdgpu :03:00.0: amdgpu: [gfxhub] page fault (src_id:0 
ring:144 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[   72.743122] amdgpu :03:00.0: amdgpu:   in page starting at address 
0x004a2000 from client 10
[   72.743135] amdgpu :03:00.0: amdgpu: 
GCVM_L2_PROTECTION_FAULT_STATUS:0x0B21
[   72.743145] amdgpu :03:00.0: amdgpu:  Faulty UTCL2 client ID: CPC 
(0x5)
[   72.743155] amdgpu :03:00.0: amdgpu:  MORE_FAULTS: 0x1
[   72.743164] amdgpu :03:00.0: amdgpu:  WALKER_ERROR: 0x0
[   72.743173] amdgpu :03:00.0: amdgpu:  PERMISSION_FAULTS: 0x2
[   72.743181] amdgpu :03:00.0: amdgpu:  MAPPING_ERROR: 0x1
[   72.743189] amdgpu :03:00.0: amdgpu:  RW: 0x0

Thanks,
Feifei

-Original Message-
From: Xu, Feifei
Sent: Tuesday, October 10, 2023 6:14 PM
To: Koenig, Christian ; Zhang, Hawking 
; Wang, Yang(Kevin) ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu:Check gfx poweron when skip flush_gpu_tlb

If that behavior is correct, this patch looks like a workaround for the HW reset 
not flushing the TLB, or for something that can be worked around by adding a GPU 
TLB flush.

Thanks,
Feifei


RE: [PATCH 0/4] support query rlcv/rlcp firmware version

2022-09-15 Thread Xu, Feifei

[AMD Official Use Only - General]



Series is Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Hawking Zhang
Sent: Friday, September 16, 2022 1:00 AM
To: amd-gfx@lists.freedesktop.org; Gao, Likun ; Deucher, 
Alexander 
Cc: Zhang, Hawking 
Subject: [PATCH 0/4] support query rlcv/rlcp firmware version

To allow querying rlcv/rlcp firmware version info

Hawking Zhang (4):
  drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfx
  drm/amdgpu: support print rlc v2_x ucode hdr
  drm/amdgpu: add two new subquery ids
  drm/amdgpu: add rlcv/rlcp version info to debugfs

 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h   |   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |  24 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 168 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h |   4 +
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c|   5 +
 include/uapi/drm/amdgpu_drm.h |   4 +
 6 files changed, 159 insertions(+), 50 deletions(-)

-- 
2.17.1

RE: [PATCH] drm/amd/pm: add 3715 softpptable support for SMU13.0.0

2022-08-08 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 



-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Tuesday, August 9, 2022 9:31 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 
; Zhang, Hawking 
Subject: [PATCH] drm/amd/pm: add 3715 softpptable support for SMU13.0.0

Add support for 3715 softpptable.

Signed-off-by: Evan Quan 
Reviewed-by: Hawking Zhang 
Change-Id: Iae7360ce853a6b5fde38025d528640c9b88fc54c
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index 0370482dd52b..cd159e240147 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -229,6 +229,7 @@ int smu_v13_0_init_pptable_microcode(struct smu_context 
*smu)
/*
 * Temporary solution for SMU V13.0.0 with SCPM enabled:
 *   - use 36831 signed pptable when pp_table_id is 3683
+*   - use 37151 signed pptable when pp_table_id is 3715
 *   - use 36641 signed pptable when pp_table_id is 3664 or 0
 * TODO: drop these when the pptable carried in vbios is ready.
 */
@@ -241,6 +242,9 @@ int smu_v13_0_init_pptable_microcode(struct smu_context 
*smu)
case 3683:
pptable_id = 36831;
break;
+   case 3715:
+   pptable_id = 37151;
+   break;
default:
dev_err(adev->dev, "Unsupported pptable id 
%d\n", pptable_id);
return -EINVAL;
@@ -478,7 +482,7 @@ int smu_v13_0_setup_pptable(struct smu_context *smu)
 
/*
 * Temporary solution for SMU V13.0.0 with SCPM disabled:
-*   - use 3664 or 3683 on request
+*   - use 3664, 3683 or 3715 on request
 *   - use 3664 when pptable_id is 0
 * TODO: drop these when the pptable carried in vbios is ready.
 */
@@ -489,6 +493,7 @@ int smu_v13_0_setup_pptable(struct smu_context *smu)
break;
case 3664:
case 3683:
+   case 3715:
break;
default:
dev_err(adev->dev, "Unsupported pptable id 
%d\n", pptable_id);
-- 
2.29.0
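
Editorial note: the SCPM-enabled mapping described in the patch comment (3683 to 36831, 3715 to 37151, 3664 or 0 to 36641, anything else rejected) can be sketched as a standalone function. The name `signed_pptable_id` is hypothetical and the function is an illustration of the mapping only, not the driver's code.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper mirroring the mapping in the patch comment:
 * returns the signed pptable id for a given pp_table_id,
 * or -EINVAL for an id the temporary solution does not cover.
 */
int signed_pptable_id(int pp_table_id)
{
	switch (pp_table_id) {
	case 0:
	case 3664:
		return 36641;
	case 3683:
		return 36831;
	case 3715: /* the id added by this patch */
		return 37151;
	default:
		return -EINVAL;
	}
}
```

In each case covered here the signed id is the unsigned id with a trailing 1 appended, with 0 treated the same as 3664.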

RE: [PATCH] drm/amd/pm: skip pptable override for smu_v13_0_7

2022-08-09 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 



-Original Message-
From: amd-gfx  On Behalf Of Kenneth Feng
Sent: Tuesday, August 9, 2022 3:22 PM
To: amd-gfx@lists.freedesktop.org
Cc: Feng, Kenneth 
Subject: [PATCH] drm/amd/pm: skip pptable override for smu_v13_0_7

skip pptable override for smu_v13_0_7 secure boards only.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
index 0370482dd52b..daf4dc9811af 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
@@ -212,6 +212,9 @@ int smu_v13_0_init_pptable_microcode(struct smu_context 
*smu)
if (!adev->scpm_enabled)
return 0;
 
+   if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 7))
+   return 0;
+
/* override pptable_id from driver parameter */
if (amdgpu_smu_pptable_id >= 0) {
pptable_id = amdgpu_smu_pptable_id;
@@ -219,13 +222,6 @@ int smu_v13_0_init_pptable_microcode(struct smu_context 
*smu)
} else {
pptable_id = smu->smu_table.boot_values.pp_table_id;
 
-   if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 7) &&
-   pptable_id == 3667)
-   pptable_id = 36671;
-
-   if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 7) &&
-   pptable_id == 3688)
-   pptable_id = 36881;
/*
 * Temporary solution for SMU V13.0.0 with SCPM enabled:
 *   - use 36831 signed pptable when pp_table_id is 3683
-- 
2.25.1

RE: [PATCH] drm/amd/pm: update SMU 13.0.0 driver_if header

2022-08-24 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 



-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Tuesday, August 23, 2022 9:23 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 
; Zhang, Hawking 
Subject: [PATCH] drm/amd/pm: update SMU 13.0.0 driver_if header

To fit the latest 78.53 PMFW.

Signed-off-by: Evan Quan 
Change-Id: I16b36a3c209c82fc2d48325f7e6ef5a702678782
---
 .../inc/pmfw_if/smu13_driver_if_v13_0_0.h | 31 +++
 drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h  |  2 +-
 2 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
index 78620b0bd279..f745cd8f1ab7 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h
@@ -24,12 +24,8 @@
 #ifndef SMU13_DRIVER_IF_V13_0_0_H
 #define SMU13_DRIVER_IF_V13_0_0_H
 
-// *** IMPORTANT ***
-// PMFW TEAM: Always increment the interface version on any change to this file
-#define SMU13_DRIVER_IF_VERSION  0x23
-
 //Increment this version if SkuTable_t or BoardTable_t change
-#define PPTABLE_VERSION 0x1D
+#define PPTABLE_VERSION 0x22
 
 #define NUM_GFXCLK_DPM_LEVELS16
 #define NUM_SOCCLK_DPM_LEVELS8
@@ -1193,8 +1189,17 @@ typedef struct {
   // SECTION: Advanced Options
   uint32_t  DebugOverrides;
 
+  // Section: Total Board Power idle vs active coefficients
+  uint8_t TotalBoardPowerSupport;
+  uint8_t TotalBoardPowerPadding[3];
+
+  int16_t TotalIdleBoardPowerM;
+  int16_t TotalIdleBoardPowerB;
+  int16_t TotalBoardPowerM;
+  int16_t TotalBoardPowerB;
+
   // SECTION: Sku Reserved
-  uint32_t Spare[64];
+  uint32_t Spare[61];
 
   // Padding for MMHUB - do not modify this
   uint32_t MmHubPadding[8];
@@ -1259,7 +1264,8 @@ typedef struct {
   // SECTION: Clock Spread Spectrum
 
   // UCLK Spread Spectrum
-  uint16_t UclkSpreadPadding;
+  uint8_t  UclkTrainingModeSpreadPercent;
+  uint8_t  UclkSpreadPadding;
   uint16_t UclkSpreadFreq;  // kHz
 
   // UCLK Spread Spectrum
@@ -1272,11 +1278,7 @@ typedef struct {
 
   // Section: Memory Config
   uint8_t  DramWidth; // Width of interface to the channel for each DRAM 
module. See DRAM_BIT_WIDTH_TYPE_e
-  uint8_t  PaddingMem1[3];
-
-  // Section: Total Board Power
-  uint16_t TotalBoardPower; //Only needed for TCP Estimated case, 
where TCP = TGP+Total Board Power
-  uint16_t BoardPowerPadding;
+  uint8_t  PaddingMem1[7];
 
   // SECTION: UMC feature flags
   uint8_t  HsrEnabled;
@@ -1375,8 +1377,11 @@ typedef struct {
   uint16_t Vcn1ActivityPercentage  ;
 
   uint32_t EnergyAccumulator;
-  uint16_t AverageSocketPower;
+  uint16_t AverageSocketPower;
+  uint16_t AverageTotalBoardPower;
+
   uint16_t AvgTemperature[TEMP_COUNT];
+  uint16_t TempPadding;
 
   uint8_t  PcieRate   ;
   uint8_t  PcieWidth  ;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
index 6fe2fe92ebd7..ac308e72241a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
@@ -30,7 +30,7 @@
 #define SMU13_DRIVER_IF_VERSION_ALDE 0x08
 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_4 0x05
 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_5 0x04
-#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0 0x2C
+#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0 0x2E
 #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_7 0x2C
 
 #define SMU13_MODE1_RESET_WAIT_TIME_IN_MS 500  //500ms
--
2.29.0
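
Editorial note: two of the hunks above trade spare or padding space for new fields, and the byte counts can be checked in isolation. The sketch below copies only the affected fields into hypothetical structs (`sku_tail_old`, `sku_tail_new`, `board_mem_old`, `board_mem_new`); it assumes natural alignment on a typical ABI and says nothing about the full SkuTable_t or BoardTable_t layout.

```c
#include <assert.h>
#include <stdint.h>

/* Field subset around "Spare" before the patch (hypothetical struct name). */
struct sku_tail_old {
	uint32_t DebugOverrides;
	uint32_t Spare[64];
};

/* Same subset after the patch: 1 + 3 + 4 * 2 = 12 new bytes, Spare shrinks by 3. */
struct sku_tail_new {
	uint32_t DebugOverrides;
	uint8_t  TotalBoardPowerSupport;
	uint8_t  TotalBoardPowerPadding[3];
	int16_t  TotalIdleBoardPowerM;
	int16_t  TotalIdleBoardPowerB;
	int16_t  TotalBoardPowerM;
	int16_t  TotalBoardPowerB;
	uint32_t Spare[61];
};

/* Field subset of the memory config section before the patch. */
struct board_mem_old {
	uint8_t  DramWidth;
	uint8_t  PaddingMem1[3];
	uint16_t TotalBoardPower;
	uint16_t BoardPowerPadding;
};

/* After the patch the two 16-bit fields are folded into padding. */
struct board_mem_new {
	uint8_t DramWidth;
	uint8_t PaddingMem1[7];
};

/* Both differences are 0 if the replaced fields occupy the same space. */
long sku_tail_size_delta(void)
{
	return (long)sizeof(struct sku_tail_new) - (long)sizeof(struct sku_tail_old);
}

long board_mem_size_delta(void)
{
	return (long)sizeof(struct board_mem_new) - (long)sizeof(struct board_mem_old);
}
```

Under those assumptions both subsets keep their size, which is consistent with Spare going from 64 to 61 entries and PaddingMem1 growing from 3 to 7 bytes.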

RE: [PATCH] drm/amd/pm: add SMU 13.0.7 missing GetPptLimit message mapping

2023-02-06 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Friday, February 3, 2023 5:39 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 

Subject: [PATCH] drm/amd/pm: add SMU 13.0.7 missing GetPptLimit message mapping

Add missing GetPptLimit message mapping.

Signed-off-by: Evan Quan 
Change-Id: Ic4edfa3153988721a6ee66dd69a1d4ca8a5ea45c
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
index 02ee248899c0..6a882c4f7cee 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
@@ -124,6 +124,7 @@ static struct cmn2asic_msg_mapping smu_v13_0_7_message_map[SMU_MSG_MAX_COUNT] =
 MSG_MAP(DFCstateControl, PPSMC_MSG_SetExternalClientDfCstateAllow, 0),
 MSG_MAP(ArmD3, PPSMC_MSG_ArmD3, 0),
 MSG_MAP(AllowGpo, PPSMC_MSG_SetGpoAllow, 0),
+   MSG_MAP(GetPptLimit, PPSMC_MSG_GetPptLimit, 0),
 };
 
 static struct cmn2asic_mapping smu_v13_0_7_clk_map[SMU_CLK_COUNT] = {
-- 
2.34.1

RE: [PATCH] drm/amd/amdgpu: fix warining during suspend

2023-02-13 Thread Xu, Feifei




Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Jack Xiao
Sent: Monday, February 13, 2023 6:52 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xiao, Jack ; jfale...@redhat.com
Subject: [PATCH] drm/amd/amdgpu: fix warining during suspend

Freeing memory was warned during suspend.
Move the self test out of suspend.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2151825
Cc: jfale...@redhat.com
Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a10b627c8357..3842e7e62eda 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4270,6 +4270,9 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
}
adev->in_suspend = false;
 
+   if (adev->enable_mes)
+   amdgpu_mes_self_test(adev);
+
if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D0))
DRM_WARN("smart shift update failed\n");
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 62cdd2113135..5826eac270d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -1284,7 +1284,7 @@ static int mes_v11_0_late_init(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
/* it's only intended for use in mes_self_test case, not for s0ix and 
reset */
-   if (!amdgpu_in_reset(adev) && !adev->in_s0ix &&
+   if (!amdgpu_in_reset(adev) && !adev->in_s0ix && !adev->in_suspend &&
(adev->ip_versions[GC_HWIP][0] != IP_VERSION(11, 0, 3)))
amdgpu_mes_self_test(adev);
 
-- 
2.37.3

RE: [PATCH 2/2] drm/amd/pm: no pptable resetup on runpm exiting

2023-02-21 Thread Xu, Feifei




Series is Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Tuesday, February 21, 2023 3:51 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 

Subject: [PATCH 2/2] drm/amd/pm: no pptable resetup on runpm exiting

It is assumed that the pptable used before runpm is the same as the one used afterwards. 
Thus, we can reuse the stored copy and do not need to set up the pptable again.

Signed-off-by: Evan Quan 
Change-Id: Ib6d61f8e8cb58df45d9e24725b0672239b3ff653
---
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index ff806a2e804f..bb25f14f0733 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -1220,10 +1220,17 @@ static int smu_smc_hw_setup(struct smu_context *smu)
return ret;
}
 
-   ret = smu_setup_pptable(smu);
-   if (ret) {
-   dev_err(adev->dev, "Failed to setup pptable!\n");
-   return ret;
+   /*
+* It is assumed the pptable used before runpm is same as
+* the one used afterwards. Thus, we can reuse the stored
+* copy and do not need to resetup the pptable again.
+*/
+   if (!adev->in_runpm) {
+   ret = smu_setup_pptable(smu);
+   if (ret) {
+   dev_err(adev->dev, "Failed to setup pptable!\n");
+   return ret;
+   }
}
 
/* smu_dump_pptable(smu); */
--
2.34.1
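
The control flow of the hunk above can be modelled outside the driver. The following is only a minimal sketch under stated assumptions: `struct fake_smu`, `fake_setup_pptable()` and `fake_hw_setup()` are hypothetical stand-ins invented for illustration, not the real `smu_context` or `smu_setup_pptable()`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SMU context; not the real struct smu_context. */
struct fake_smu {
	bool in_runpm;       /* true while resuming from runtime PM */
	int pptable_setups;  /* how many times the pptable was (re)built */
};

/* Pretend to parse and store the pptable; always succeeds in this model. */
static int fake_setup_pptable(struct fake_smu *smu)
{
	smu->pptable_setups++;
	return 0;
}

/* Same shape as the patched hunk: skip the setup when leaving runpm. */
static int fake_hw_setup(struct fake_smu *smu)
{
	int ret;

	if (!smu->in_runpm) {
		ret = fake_setup_pptable(smu);
		if (ret)
			return ret;
	}
	return 0;
}
```

In this model a cold start builds the table once and a later runpm exit leaves the stored copy untouched, which is the assumption the commit message states.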

RE: [PATCH] drm/amdgpu: Fix sdma v4 sw fini error

2023-04-06 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 

-Original Message-
From: lyndonli  
Sent: Thursday, April 6, 2023 4:12 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Gao, Likun ; Zhang, 
Hawking ; Li, Lyndon 
Subject: [PATCH] drm/amdgpu: Fix sdma v4 sw fini error

Fix sdma v4 sw fini error for sdma 4.2.2 to solve the following general 
protection fault

[  +0.108196] general protection fault, probably for non-canonical address 0xd5e5a4ae79d24a32:  [#1] PREEMPT SMP PTI
[  +0.18] RIP: 0010:free_fw_priv+0xd/0x70
[  +0.22] Call Trace:
[  +0.12]  
[  +0.11]  release_firmware+0x55/0x80
[  +0.21]  amdgpu_ucode_release+0x11/0x20 [amdgpu]
[  +0.000415]  amdgpu_sdma_destroy_inst_ctx+0x4f/0x90 [amdgpu]
[  +0.000360]  sdma_v4_0_sw_fini+0xce/0x110 [amdgpu]

Signed-off-by: lyndonli 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index b5affba22156..96b0c3d42346 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1870,7 +1870,7 @@ static int sdma_v4_0_sw_fini(void *handle)
amdgpu_ring_fini(&adev->sdma.instance[i].page);
}
 
-   if (adev->ip_versions[SDMA0_HWIP][0] == IP_VERSION(4, 2, 0) ||
+   if (adev->ip_versions[SDMA0_HWIP][0] == IP_VERSION(4, 2, 2) ||
 adev->ip_versions[SDMA0_HWIP][0] == IP_VERSION(4, 4, 0))
amdgpu_sdma_destroy_inst_ctx(adev, true);
else
--
2.34.1
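
To see why the one-digit change matters, it helps to look at how the version triple is compared. This is a rough sketch: the `IP_VERSION()` packing below is my understanding of the macro at the time of this patch and should be treated as an assumption, and `sdma_fini_takes_true_branch()` is a hypothetical helper that only mirrors the patched condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing of major, minor and revision into one integer. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/*
 * Hypothetical helper: true when the patched condition would select the
 * amdgpu_sdma_destroy_inst_ctx(adev, true) branch in sw_fini.
 */
static bool sdma_fini_takes_true_branch(uint32_t sdma_ver)
{
	return sdma_ver == IP_VERSION(4, 2, 2) ||
	       sdma_ver == IP_VERSION(4, 4, 0);
}
```

Because the comparison is an exact match on the packed value, a check against 4.2.0 never fires on 4.2.2 hardware, so the teardown path silently takes the other branch.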

RE: [PATCH] drm/amdgpu: extend the default timeout for kernel compute queues

2023-04-21 Thread Xu, Feifei

[AMD Official Use Only - General]

For some Vulkan stress tests, it might not be possible to rewrite them using ROCm.
On second thought, 120s might be too risky, because the softlockup timeout is 
also set to 120s.

To support stress tests like the one I recently saw on stressbench (a Vulkan 
stress test), we could shorten the 120s to a reasonable value such as 100s, 
which would also fix the software hang.
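
The constraint described above is simply that the job timeout has to expire before the soft-lockup threshold does. A minimal sketch of that comparison, assuming the 120s soft-lockup value quoted in this mail; the macro and function names are hypothetical and do not exist in the driver:

```c
#include <stdbool.h>

/* Assumption for illustration: the soft-lockup threshold quoted in the mail. */
#define SKETCH_SOFTLOCKUP_TIMEOUT_MS 120000u

/* Hypothetical check: a job timeout only helps if it fires first. */
static bool timeout_fires_before_softlockup(unsigned int job_timeout_ms)
{
	return job_timeout_ms < SKETCH_SOFTLOCKUP_TIMEOUT_MS;
}
```

Under that assumption a 120s compute timeout ties with the watchdog, while 100s leaves a 20s margin.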

-Original Message-
From: Alex Deucher  
Sent: Thursday, April 20, 2023 8:57 PM
To: Xu, Feifei 
Cc: amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Subject: Re: [PATCH] drm/amdgpu: extend the default timeout for kernel compute 
queues

On Thu, Apr 20, 2023 at 5:19 AM Feifei Xu  wrote:
>
> Extend to 120s. The default timeout value should also extend if 
> compute shader execution time extended. Otherwise some stress test 
> will trigger compute ring timeout in software.

I think that's probably too long.  2 minutes is a long time to have a hung 
system.  I think we should rework the tests or use ROCm for long running test 
cases.

Alex

>
> Signed-off-by: Feifei Xu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e536886f6d42..1f98b4b0a549 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3475,7 +3475,7 @@ static int 
> amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
>
> /*
>  * By default timeout for non compute jobs is 10000
> -* and 60000 for compute jobs.
> +* and 120000 for compute jobs.
>  * In SR-IOV or passthrough mode, timeout for compute
>  * jobs are 60000 by default.
>  */
> @@ -3485,7 +3485,7 @@ static int 
> amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
> adev->compute_timeout = amdgpu_sriov_is_pp_one_vf(adev) ?
> msecs_to_jiffies(60000) : 
> msecs_to_jiffies(10000);
> else
> -   adev->compute_timeout =  msecs_to_jiffies(60000);
> +   adev->compute_timeout =  msecs_to_jiffies(120000);
>
> if (strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) {
> while ((timeout_setting = strsep(&input, ",")) &&
> --
> 2.34.1
>

RE: [PATCH 2/2] drm/amdgpu: Use the default reset when loading amdgpu driver

2023-04-23 Thread Xu, Feifei

[AMD Official Use Only - General]

I think you might be referring to this: the module parameter reset_method will 
not affect the driver-loading code path. When loading the driver, it should use 
the default reset, which might be mode1, mode2 or BACO, instead of the 
explicitly requested mode2.

With the confusing commit message corrected and a comment added before the code 
"r = amdgpu_asic_reset(adev);":



Reviewed-by: Feifei Xu 

-Original Message-
From: lyndonli  
Sent: Monday, April 24, 2023 9:58 AM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun ; Zhao, Victor ; 
Feng, Kenneth ; Xu, Feifei ; Li, 
Yunxiang (Teddy) ; Li, Lyndon 
Subject: [PATCH 2/2] drm/amdgpu: Use the default reset when loading amdgpu 
driver

Below call trace and errors are observed when reloading amdgpu driver with the 
module parameter reset_method=3.

It should do a mode1 reset when loading the driver.

[  +2.180243] [drm] psp gfx command ID_LOAD_TOC(0x20) failed and response status is (0x0)
[  +0.11] [drm:psp_hw_start [amdgpu]] *ERROR* Failed to load toc
[  +0.000890] [drm:psp_hw_start [amdgpu]] *ERROR* PSP tmr init failed!
[  +0.020683] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[  +0.03] RIP: 0010:amdgpu_bo_release_notify+0x1ef/0x210 [amdgpu]
[  +0.04] Call Trace:
[  +0.03]  
[  +0.08]  ttm_bo_release+0x2c4/0x330 [amdttm]
[  +0.26]  amdttm_bo_put+0x3c/0x70 [amdttm]
[  +0.20]  amdgpu_bo_free_kernel+0xe6/0x140 [amdgpu]
[  +0.000728]  psp_v11_0_ring_destroy+0x34/0x60 [amdgpu]
[  +0.000826]  psp_hw_init+0xe7/0x2f0 [amdgpu]
[  +0.000813]  amdgpu_device_fw_loading+0x1ad/0x2d0 [amdgpu]
[  +0.000731]  amdgpu_device_init.cold+0x108e/0x2002 [amdgpu]
[  +0.001071]  ? do_pci_enable_device+0xe1/0x110
[  +0.11]  amdgpu_driver_load_kms+0x1a/0x160 [amdgpu]
[  +0.000729]  amdgpu_pci_probe+0x179/0x3a0 [amdgpu]

Signed-off-by: lyndonli 
Signed-off-by: Yunxiang Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e536886f6d42..9738e3660cf1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3578,6 +3578,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
int r, i;
bool px = false;
u32 max_MBps;
+   int tmp;
 
adev->shutdown = false;
adev->flags = flags;
@@ -3799,7 +3800,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
}
}
} else {
+   tmp = amdgpu_reset_method;
+   amdgpu_reset_method = AMD_RESET_METHOD_NONE;
r = amdgpu_asic_reset(adev);
+   amdgpu_reset_method = tmp;
if (r) {
dev_err(adev->dev, "asic reset on init 
failed\n");
goto failed;
--
2.34.1
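
The hunk above follows a save, override, restore pattern around the reset call. A minimal sketch of that pattern in isolation, assuming nothing about the real driver beyond what the diff shows; every name below is a hypothetical stand-in, and the enum values are illustrative only:

```c
/* Hypothetical stand-ins for the module parameter and its values. */
enum fake_reset_method { FAKE_RESET_NONE = -1, FAKE_RESET_MODE2 = 3 };

static int fake_reset_method_param = FAKE_RESET_MODE2; /* e.g. reset_method=3 */
static int fake_method_seen_by_reset;                   /* recorded for inspection */

/* Pretend ASIC reset: it only records which method it observed. */
static int fake_asic_reset(void)
{
	fake_method_seen_by_reset = fake_reset_method_param;
	return 0;
}

/* Same save / override / restore shape as the hunk in amdgpu_device_init(). */
static int fake_reset_on_init(void)
{
	int saved = fake_reset_method_param;
	int r;

	fake_reset_method_param = FAKE_RESET_NONE; /* let the ASIC pick its default */
	r = fake_asic_reset();
	fake_reset_method_param = saved;           /* user's choice applies afterwards */
	return r;
}
```

One trade-off of this shape is that it briefly mutates a global, so anything reading the parameter concurrently would see the overridden value; whether that can happen this early in probe is not something this sketch settles.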

RE: [PATCH 1/2] drm/amdgpu: Fix mode2 reset for sienna cichlid

2023-04-23 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 

-Original Message-
From: lyndonli  
Sent: Monday, April 24, 2023 9:58 AM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun ; Zhao, Victor ; 
Feng, Kenneth ; Xu, Feifei ; Li, 
Yunxiang (Teddy) ; Li, Lyndon 
Subject: [PATCH 1/2] drm/amdgpu: Fix mode2 reset for sienna cichlid

Before this change, sienna_cichlid_get_reset_handler always returned NULL, 
even when the amdgpu driver was loaded with the module parameter reset_method=3.

Signed-off-by: lyndonli 
Signed-off-by: Yunxiang Li 
---
 drivers/gpu/drm/amd/amdgpu/sienna_cichlid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sienna_cichlid.c 
b/drivers/gpu/drm/amd/amdgpu/sienna_cichlid.c
index 81a6d5b94987..8b8086d5c864 100644
--- a/drivers/gpu/drm/amd/amdgpu/sienna_cichlid.c
+++ b/drivers/gpu/drm/amd/amdgpu/sienna_cichlid.c
@@ -40,7 +40,7 @@ static bool sienna_cichlid_is_mode2_default(struct 
amdgpu_reset_control *reset_c
adev->pm.fw_version >= 0x3a5500 && !amdgpu_sriov_vf(adev))
return true;
 #endif
-   return false;
+   return amdgpu_reset_method == AMD_RESET_METHOD_MODE2;
 }
 
 static struct amdgpu_reset_handler *
--
2.34.1
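
The effect of the one-line change can be shown with a simplified predicate. This is a sketch, not the driver function: the firmware and IP-version gate inside the `#if` block is collapsed into a single hypothetical `fw_prefers_mode2` flag, and the enum values are illustrative.

```c
#include <stdbool.h>

/* Hypothetical values; only the comparison matters for this sketch. */
enum sketch_method { SKETCH_METHOD_NONE = -1, SKETCH_METHOD_MODE2 = 3 };

/* Before the patch: outside the firmware-gated path, always false. */
static bool is_mode2_default_old(int requested, bool fw_prefers_mode2)
{
	(void)requested;
	return fw_prefers_mode2;
}

/* After the patch: an explicit reset_method=3 request also counts. */
static bool is_mode2_default_new(int requested, bool fw_prefers_mode2)
{
	if (fw_prefers_mode2)
		return true;
	return requested == SKETCH_METHOD_MODE2;
}
```

With the old predicate an explicit mode2 request was ignored whenever the firmware gate did not match, which is consistent with the handler lookup returning NULL as described in the commit message.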

RE: [PATCH 2/2] drm/amdgpu: Use the default reset when loading amdgpu driver

2023-04-24 Thread Xu, Feifei

[AMD Official Use Only - General]

Most ASICs use mode1 as the default, but some do not. You can check here: 
soc21_asic_reset_method()/nv_asic_reset_method()
The patch uses the default reset method. 

Thanks,
Feifei

-Original Message-
From: Li, Lyndon  
Sent: Monday, April 24, 2023 4:01 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun ; Zhao, Victor ; 
Feng, Kenneth ; Li, Yunxiang (Teddy) 
Subject: RE: [PATCH 2/2] drm/amdgpu: Use the default reset when loading amdgpu 
driver

[AMD Official Use Only - General]

Hi Feifei,

Thanks for your feedback. I will add comments in the code and revise the commit message.
I think there is a slight misunderstanding.

It should do a mode1 reset when loading or reloading the driver, regardless of 
the module parameter reset_method. 
It will call amdgpu_device_mode1_reset in amdgpu_asic_reset if 
amdgpu_reset_method is set to AMD_RESET_METHOD_NONE. 
Here's an example,
modprobe amdgpu
modprobe -r amdgpu
modprobe amdgpu reset_method=3 //The real reset method should be mode1 reset, 
since it is initialization.

Regards,
Lyndon

> -Original Message-
> From: Xu, Feifei 
> Sent: Monday, April 24, 2023 2:00 PM
> To: Li, Lyndon ; amd-gfx@lists.freedesktop.org
> Cc: Liu, Shaoyun ; Zhao, Victor 
> ; Feng, Kenneth ; Li, 
> Yunxiang (Teddy) ; Li, Lyndon 
> Subject: RE: [PATCH 2/2] drm/amdgpu: Use the default reset when 
> loading amdgpu driver
> 
> [AMD Official Use Only - General]
> 
> I think you might be referring to this: the module parameter reset_method 
> will not affect the driver-loading code path. When loading the driver, it 
> should use the default reset, which might be mode1, mode2 or BACO, instead 
> of the explicitly requested mode2.
> 
> With the confusing commit message corrected and a comment added before the 
> code "r = amdgpu_asic_reset(adev);":
> 
> 
> 
> Reviewed-by: Feifei Xu 
> 
> -Original Message-
> From: lyndonli 
> Sent: Monday, April 24, 2023 9:58 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Shaoyun ; Zhao, Victor 
> ; Feng, Kenneth ; Xu, 
> Feifei ; Li, Yunxiang (Teddy) 
> ; Li, Lyndon 
> Subject: [PATCH 2/2] drm/amdgpu: Use the default reset when loading 
> amdgpu driver
> 
> Below call trace and errors are observed when reloading amdgpu driver 
> with the module parameter reset_method=3.
> 
> It should do a mode1 reset when loading the driver.
> 
> [  +2.180243] [drm] psp gfx command ID_LOAD_TOC(0x20) failed and response status is (0x0)
> [  +0.11] [drm:psp_hw_start [amdgpu]] *ERROR* Failed to load toc
> [  +0.000890] [drm:psp_hw_start [amdgpu]] *ERROR* PSP tmr init failed!
> [  +0.020683] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [  +0.03] RIP: 0010:amdgpu_bo_release_notify+0x1ef/0x210 [amdgpu]
> [  +0.04] Call Trace:
> [  +0.03]  
> [  +0.08]  ttm_bo_release+0x2c4/0x330 [amdttm]
> [  +0.26]  amdttm_bo_put+0x3c/0x70 [amdttm]
> [  +0.20]  amdgpu_bo_free_kernel+0xe6/0x140 [amdgpu]
> [  +0.000728]  psp_v11_0_ring_destroy+0x34/0x60 [amdgpu]
> [  +0.000826]  psp_hw_init+0xe7/0x2f0 [amdgpu]
> [  +0.000813]  amdgpu_device_fw_loading+0x1ad/0x2d0 [amdgpu]
> [  +0.000731]  amdgpu_device_init.cold+0x108e/0x2002 [amdgpu]
> [  +0.001071]  ? do_pci_enable_device+0xe1/0x110
> [  +0.11]  amdgpu_driver_load_kms+0x1a/0x160 [amdgpu]
> [  +0.000729]  amdgpu_pci_probe+0x179/0x3a0 [amdgpu]
> 
> Signed-off-by: lyndonli 
> Signed-off-by: Yunxiang Li 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e536886f6d42..9738e3660cf1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3578,6 +3578,7 @@ int amdgpu_device_init(struct amdgpu_device 
> *adev,
>   int r, i;
>   bool px = false;
>   u32 max_MBps;
> + int tmp;
> 
>   adev->shutdown = false;
>   adev->flags = flags;
> @@ -3799,7 +3800,10 @@ int amdgpu_device_init(struct amdgpu_device 
> *adev,
>   }
>   }
>   } else {
> + tmp = amdgpu_reset_method;
> + amdgpu_reset_method =
> AMD_RESET_METHOD_NONE;
>   r = amdgpu_asic_reset(adev);
> + amdgpu_reset_method = tmp;
>   if (r) {
>   dev_err(adev->dev, "asic reset on init 
> failed\n");
>   goto failed;
> --
> 2.34.1

RE: [PATCH v3] drm/amdgpu: drop gfx_v11_0_cp_ecc_error_irq_funcs

2023-04-26 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 

-Original Message-
From: Horatio Zhang  
Sent: Wednesday, April 26, 2023 4:41 PM
To: Zhang, Hawking ; Koenig, Christian 
; Chen, Guchun ; 
amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Yao, Longlong ; 
Zhang, Horatio ; Zhang, Hawking ; 
Chen, Guchun 
Subject: [PATCH v3] drm/amdgpu: drop gfx_v11_0_cp_ecc_error_irq_funcs

The gfx.cp_ecc_error_irq is retired in gfx11. However, gfx_v11_0_hw_fini still 
uses amdgpu_irq_put to disable this interrupt, which causes the call trace 
below.

[  102.873958] Call Trace:
[  102.873959]  
[  102.873961]  gfx_v11_0_hw_fini+0x23/0x1e0 [amdgpu]
[  102.874019]  gfx_v11_0_suspend+0xe/0x20 [amdgpu]
[  102.874072]  amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu]
[  102.874122]  amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
[  102.874172]  amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu]
[  102.874223]  amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu]
[  102.874321]  amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
[  102.874375]  process_one_work+0x21f/0x3f0
[  102.874377]  worker_thread+0x200/0x3e0
[  102.874378]  ? process_one_work+0x3f0/0x3f0
[  102.874379]  kthread+0xfd/0x130
[  102.874380]  ? kthread_complete_and_exit+0x20/0x20
[  102.874381]  ret_from_fork+0x22/0x30

v2:
- Handle umc and gfx ras cases in separated patch
- Retired the gfx_v11_0_cp_ecc_error_irq_funcs in gfx11

v3:
- Improve the subject and code comments
- Add a gfx11 check in amdgpu_gfx_ras_late_init

Signed-off-by: Horatio Zhang 
Reviewed-by: Hawking Zhang 
Acked-by: Christian König 
Reviewed-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |  8 --
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 38 -
 2 files changed, 5 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 60bb4bba1994..5e69eec4b754 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -719,9 +719,11 @@ int amdgpu_gfx_ras_late_init(struct amdgpu_device *adev, 
struct ras_common_if *r
if (r)
return r;
 
-   r = amdgpu_irq_get(adev, &adev->gfx.cp_ecc_error_irq, 0);
-   if (r)
-   goto late_fini;
+   if (!(adev->ip_versions[GC_HWIP][0] == IP_VERSION(11, 0, 3))){
+   r = amdgpu_irq_get(adev, &adev->gfx.cp_ecc_error_irq, 
0);
+   if (r)
+   goto late_fini;
+   }
} else {
amdgpu_ras_feature_enable_on_boot(adev, ras_block, 0);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 8a4c4769e607..e9491aec3cae 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -1355,13 +1355,6 @@ static int gfx_v11_0_sw_init(void *handle)
if (r)
return r;
 
-   /* ECC error */
-   r = amdgpu_irq_add_id(adev, SOC21_IH_CLIENTID_GRBM_CP,
- GFX_11_0_0__SRCID__CP_ECC_ERROR,
- &adev->gfx.cp_ecc_error_irq);
-   if (r)
-   return r;
-
/* FED error */
r = amdgpu_irq_add_id(adev, SOC21_IH_CLIENTID_GFX,
  GFX_11_0_0__SRCID__RLC_GC_FED_INTERRUPT,
@@ -4483,7 +4476,6 @@ static int gfx_v11_0_hw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
int r;
 
-   amdgpu_irq_put(adev, &adev->gfx.cp_ecc_error_irq, 0);
amdgpu_irq_put(adev, &adev->gfx.priv_reg_irq, 0);
amdgpu_irq_put(adev, &adev->gfx.priv_inst_irq, 0);
 
@@ -5970,28 +5962,6 @@ static void 
gfx_v11_0_set_compute_eop_interrupt_state(struct amdgpu_device *adev
WREG32_SOC15_IP(GC, reg_addr, tmp); \
} while (0)
 
-static int gfx_v11_0_set_cp_ecc_error_state(struct amdgpu_device *adev,
-   struct amdgpu_irq_src 
*source,
-   unsigned type,
-   enum 
amdgpu_interrupt_state state)
-{
-   uint32_t ecc_irq_state = 0;
-   uint32_t pipe0_int_cntl_addr = 0;
-   int i = 0;
-
-   ecc_irq_state = (state == AMDGPU_IRQ_STATE_ENABLE) ? 1 : 0;
-
-   pipe0_int_cntl_addr = SOC15_REG_OFFSET(GC, 0, regCP_ME1_PIPE0_INT_CNTL);
-
-   WREG32_FIELD15_PREREG(GC, 0, CP_INT_CNTL_RING0, 
CP_ECC_ERROR_INT_ENABLE, ecc_irq_state);
-
-   for (i = 0; i < adev->gfx.mec.num_pipe_per_mec; i++)
-   SET_ECC_ME_PIPE_STATE(pipe0_int_cntl_addr + i * 
CP_ME1_PIPE_INST_ADDR_INTERVAL,
-   ecc_irq_state);
-
-   return 0;
-}
-
 static int gfx_v11_0
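
The failure mode in the commit message above is an unbalanced put on an interrupt source that is no longer enabled. A rough model of that balance rule, under stated assumptions: `struct sketch_irq_src` and both helpers are hypothetical, and the real `amdgpu_irq_get()`/`amdgpu_irq_put()` may behave differently in detail.

```c
#include <stdbool.h>

/* Hypothetical miniature of a refcounted interrupt source. */
struct sketch_irq_src {
	bool registered;  /* set when the source was added at sw_init time */
	int refcount;     /* number of outstanding get() calls */
};

/* Enable: only valid for a registered source. */
static int sketch_irq_get(struct sketch_irq_src *src)
{
	if (!src->registered)
		return -1;
	src->refcount++;
	return 0;
}

/* Disable: fails if nothing was enabled, which is the unbalanced case. */
static int sketch_irq_put(struct sketch_irq_src *src)
{
	if (!src->registered || src->refcount == 0)
		return -1;
	src->refcount--;
	return 0;
}
```

In this model, dropping the registration without also dropping the matching put leaves exactly one failing call on teardown, which is why the patch removes the get, the put and the registration together.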

RE: [PATCH 1/2] drm/amdgpu: fix vga_set_state NULL pointer issue

2023-05-21 Thread Xu, Feifei

[AMD Official Use Only - General]

Reviewed-by: Feifei Xu 



-Original Message-
From: amd-gfx  On Behalf Of Gao, Likun
Sent: Friday, May 19, 2023 7:17 PM
To: amd-gfx list 
Cc: Zhang, Hawking 
Subject: FW: [PATCH 1/2] drm/amdgpu: fix vga_set_state NULL pointer issue

[AMD Official Use Only - General]

From: Likun Gao 

Fix the NULL pointer issue for the vga_set_state function, as not all ASICs need 
this operation.

Signed-off-by: Likun Gao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ce196badf42d..5af954abd5ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1260,7 +1260,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
 /*
  * ASICs macro.
  */
-#define amdgpu_asic_set_vga_state(adev, state) (adev)->asic_funcs->set_vga_state((adev), (state))
+#define amdgpu_asic_set_vga_state(adev, state) \
+((adev)->asic_funcs->set_vga_state ? (adev)->asic_funcs->set_vga_state((adev), (state)) : 0)
 #define amdgpu_asic_reset(adev) (adev)->asic_funcs->reset((adev))
 #define amdgpu_asic_reset_method(adev) (adev)->asic_funcs->reset_method((adev))
 #define amdgpu_asic_get_xclk(adev) (adev)->asic_funcs->get_xclk((adev))
--
2.34.1
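
The macro change above is an instance of guarding an optional callback in an ops table. A self-contained sketch of the same idea follows; all names are hypothetical, and the hook returns `int` here (an assumption made so that both arms of the conditional expression have the same type in standard C).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a device with an optional callback. */
struct mini_dev;
struct mini_funcs {
	int (*set_vga_state)(struct mini_dev *dev, bool state);
};
struct mini_dev {
	const struct mini_funcs *funcs;
	bool vga_state;
};

/* Call the hook only if the ASIC provides one; otherwise evaluate to 0. */
#define mini_set_vga_state(dev, state) \
	((dev)->funcs->set_vga_state ? \
	 (dev)->funcs->set_vga_state((dev), (state)) : 0)

/* Example implementation for an ASIC that does have VGA state to set. */
static int mini_vga_impl(struct mini_dev *dev, bool state)
{
	dev->vga_state = state;
	return 1;
}

static const struct mini_funcs with_hook = { .set_vga_state = mini_vga_impl };
static const struct mini_funcs without_hook = { .set_vga_state = NULL };
```

Putting the check in the macro keeps every call site unchanged, at the cost of hiding the fact that the call may be a no-op on some ASICs.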

RE: [PATCH] drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram

2023-05-30 Thread Xu, Feifei

[AMD Official Use Only - General]

Acked-by: Feifei Xu 

-Original Message-
From: Horatio Zhang 
Sent: Tuesday, May 30, 2023 2:53 AM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Yao, Longlong ; 
Zhang, Horatio ; Pan, Xinhui 
Subject: [PATCH] drm/amdgpu: fix Null pointer dereference error in 
amdgpu_device_recover_vram

Use amdgpu_bo_vm_destroy to handle the resource release of the shadow bo. 
During amdgpu_mes_self_test, the shadow bo was released but
vmbo->shadow_list was not, which caused a NULL pointer dereference
in amdgpu_device_recover_vram during GPU reset.

Fixes: cd7050908070 ("drm/amdgpu: Fix vram recover doesn't work after whole GPU 
reset (v2)")
Signed-off-by: xinhui pan 
Signed-off-by: Horatio Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c  |  1 -
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 46f249912b67..4e46f8f1b3de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -79,9 +79,10 @@ static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo)
 static void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
-   struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
+   struct amdgpu_bo *shadow_bo = ttm_to_amdgpu_bo(tbo), *bo;
struct amdgpu_bo_vm *vmbo;

+   bo = shadow_bo->parent;
vmbo = to_amdgpu_bo_vm(bo);
/* in case amdgpu_device_recover_vram got NULL of bo->parent */
if (!list_empty(&vmbo->shadow_list)) {
@@ -711,11 +712,6 @@ int amdgpu_bo_create_vm(struct amdgpu_device *adev,
return r;

*vmbo_ptr = to_amdgpu_bo_vm(bo_ptr);
-   INIT_LIST_HEAD(&(*vmbo_ptr)->shadow_list);
-   /* Set destroy callback to amdgpu_bo_vm_destroy after vmbo->shadow_list
-* is initialized.
-*/
-   bo_ptr->tbo.destroy = &amdgpu_bo_vm_destroy;
return r;
 }

@@ -732,6 +728,8 @@ void amdgpu_bo_add_to_shadow_list(struct amdgpu_bo_vm *vmbo)

mutex_lock(&adev->shadow_list_lock);
list_add_tail(&vmbo->shadow_list, &adev->shadow_list);
+   vmbo->shadow->parent = amdgpu_bo_ref(&vmbo->bo);
+   vmbo->shadow->tbo.destroy = &amdgpu_bo_vm_destroy;
mutex_unlock(&adev->shadow_list_lock);
 }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
index cc3b1b596e56..dea1a64be44d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
@@ -573,7 +573,6 @@ int amdgpu_vm_pt_create(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
return r;
}

-   (*vmbo)->shadow->parent = amdgpu_bo_ref(bo);
amdgpu_bo_add_to_shadow_list(*vmbo);

return 0;
--
2.34.1

RE: [PATCH] drm/amdgpu/mmsch: Correct the definition for mmsch init header

2023-06-06 Thread Xu, Feifei

[AMD Official Use Only - General]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Emily Deng
Sent: Tuesday, June 6, 2023 2:52 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily 
Subject: [PATCH] drm/amdgpu/mmsch: Correct the definition for mmsch init header

The header is version-specific, so it shouldn't use AMDGPU_MAX_VCN_INSTANCES.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h | 4 +++-
 drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h | 4 +++-
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c   | 2 +-
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h 
b/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h
index 3e4e858a6965..a773ef61b78c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/mmsch_v3_0.h
@@ -30,6 +30,8 @@
 #define MMSCH_VERSION_MINOR0
 #define MMSCH_VERSION  (MMSCH_VERSION_MAJOR << 16 | MMSCH_VERSION_MINOR)

+#define MMSCH_V3_0_VCN_INSTANCES 0x2
+
 enum mmsch_v3_0_command_type {
MMSCH_COMMAND__DIRECT_REG_WRITE = 0,
MMSCH_COMMAND__DIRECT_REG_POLLING = 2,
@@ -47,7 +49,7 @@ struct mmsch_v3_0_table_info {
 struct mmsch_v3_0_init_header {
uint32_t version;
uint32_t total_size;
-   struct mmsch_v3_0_table_info inst[AMDGPU_MAX_VCN_INSTANCES];
+   struct mmsch_v3_0_table_info inst[MMSCH_V3_0_VCN_INSTANCES];
 };

 struct mmsch_v3_0_cmd_direct_reg_header {
diff --git a/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h 
b/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h
index 83653a50a1a2..796d4f8791e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/mmsch_v4_0.h
@@ -43,6 +43,8 @@
 #define MMSCH_VF_MAILBOX_RESP__OK 0x1
 #define MMSCH_VF_MAILBOX_RESP__INCOMPLETE 0x2

+#define MMSCH_V4_0_VCN_INSTANCES 0x2
+
 enum mmsch_v4_0_command_type {
MMSCH_COMMAND__DIRECT_REG_WRITE = 0,
MMSCH_COMMAND__DIRECT_REG_POLLING = 2,
@@ -60,7 +62,7 @@ struct mmsch_v4_0_table_info {
 struct mmsch_v4_0_init_header {
uint32_t version;
uint32_t total_size;
-   struct mmsch_v4_0_table_info inst[AMDGPU_MAX_VCN_INSTANCES];
+   struct mmsch_v4_0_table_info inst[MMSCH_V4_0_VCN_INSTANCES];
struct mmsch_v4_0_table_info jpegdec;
 };

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
index 70fefbf26c48..c8f63b3c6f69 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
@@ -1313,7 +1313,7 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device 
*adev)

header.version = MMSCH_VERSION;
header.total_size = sizeof(struct mmsch_v3_0_init_header) >> 2;
-   for (i = 0; i < AMDGPU_MAX_VCN_INSTANCES; i++) {
+   for (i = 0; i < MMSCH_V3_0_VCN_INSTANCES; i++) {
header.inst[i].init_status = 0;
header.inst[i].table_offset = 0;
header.inst[i].table_size = 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index 60c3fd20e8ce..8d371faaa2b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -1239,7 +1239,7 @@ static int vcn_v4_0_start_sriov(struct amdgpu_device 
*adev)

header.version = MMSCH_VERSION;
header.total_size = sizeof(struct mmsch_v4_0_init_header) >> 2;
-   for (i = 0; i < AMDGPU_MAX_VCN_INSTANCES; i++) {
+   for (i = 0; i < MMSCH_V4_0_VCN_INSTANCES; i++) {
header.inst[i].init_status = 0;
header.inst[i].table_offset = 0;
header.inst[i].table_size = 0;
--
2.36.1
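
Why the array bound matters can be seen from how `total_size` is derived from `sizeof`. The sketch below uses hypothetical struct and macro names; the three `uint32_t` fields are taken from the members the patch touches, and the driver-wide maximum of 4 is an invented value for illustration only.

```c
#include <stdint.h>

#define SKETCH_FW_INSTANCES      2  /* value the patch pins for this header version */
#define SKETCH_DRV_MAX_INSTANCES 4  /* hypothetical driver-wide maximum */

struct sketch_table_info {
	uint32_t init_status;
	uint32_t table_offset;
	uint32_t table_size;
};

/* Header sized by the firmware interface version. */
struct sketch_header_fixed {
	uint32_t version;
	uint32_t total_size;
	struct sketch_table_info inst[SKETCH_FW_INSTANCES];
};

/* Header sized by a driver constant that can grow over time. */
struct sketch_header_drifting {
	uint32_t version;
	uint32_t total_size;
	struct sketch_table_info inst[SKETCH_DRV_MAX_INSTANCES];
};

/* total_size is expressed in 32-bit words, hence the shift by 2. */
static uint32_t sketch_fixed_dwords(void)
{
	return (uint32_t)(sizeof(struct sketch_header_fixed) >> 2);
}

static uint32_t sketch_drifting_dwords(void)
{
	return (uint32_t)(sizeof(struct sketch_header_drifting) >> 2);
}
```

If the driver-wide constant grows, the second layout changes both the reported size and the offset of anything placed after `inst[]`, while the consumer of the header still expects the first layout.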

RE: [PATCH] drm/amd/pm: workaround for compute workload type on some skus

2023-06-08 Thread Xu, Feifei

[AMD Official Use Only - General]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Kenneth Feng
Sent: Friday, June 9, 2023 10:56 AM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan ; Feng, Kenneth 
Subject: [PATCH] drm/amd/pm: workaround for compute workload type on some skus

On smu 13.0.0, the compute workload type cannot be set on all the skus due to 
some other problems. This workaround is to make sure compute workload type can 
also run on some specific skus.

Signed-off-by: Kenneth Feng 
---
 .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c  | 26 +++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index e2265f50bacc..6e8acd021ee6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -2179,6 +2179,32 @@ static int smu_v13_0_0_set_power_profile_mode(struct 
smu_context *smu,
}
}

+   if (smu->power_profile_mode == PP_SMC_POWER_PROFILE_COMPUTE &&
+   (((smu->adev->pdev->device == 0x744C) && (smu->adev->pdev->revision == 0xC8)) ||
+   ((smu->adev->pdev->device == 0x744C) && (smu->adev->pdev->revision == 0xCC)))) {
+   ret = smu_cmn_update_table(smu,
+  SMU_TABLE_ACTIVITY_MONITOR_COEFF,
+  WORKLOAD_PPLIB_COMPUTE_BIT,
+  (void *)(&activity_monitor_external),
+  false);
+   if (ret) {
+   dev_err(smu->adev->dev, "[%s] Failed to get activity 
monitor!", __func__);
+   return ret;
+   }
+
+   ret = smu_cmn_update_table(smu,
+  SMU_TABLE_ACTIVITY_MONITOR_COEFF,
+  WORKLOAD_PPLIB_CUSTOM_BIT,
+  (void *)(&activity_monitor_external),
+  true);
+   if (ret) {
+   dev_err(smu->adev->dev, "[%s] Failed to set activity 
monitor!", __func__);
+   return ret;
+   }
+
+   smu->power_profile_mode = PP_SMC_POWER_PROFILE_CUSTOM;
+   }
+
/* conv PP_SMC_POWER_PROFILE* to WORKLOAD_PPLIB_*_BIT */
workload_type = smu_cmn_to_asic_specific_index(smu,
   
CMN2ASIC_MAPPING_WORKLOAD,
--
2.34.1
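
The SKU test in the hunk above reduces to one device ID with two revisions. A minimal sketch of that predicate, using the IDs listed in the patch; the helper name is hypothetical and does not exist in the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper; the IDs are the ones listed in the patch. */
static bool sketch_needs_compute_workaround(uint16_t device, uint8_t revision)
{
	return device == 0x744C && (revision == 0xC8 || revision == 0xCC);
}
```

Factoring the check this way would also make it cheaper to extend the list if more SKUs turn out to need the same workaround.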

RE: [PATCH] drm/amdgpu: skip vram reserve on firmware_v2_2 for bare-metal

2022-11-23 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 

-Original Message-
From: Gao, Likun  
Sent: Wednesday, November 23, 2022 6:01 PM
To: amd-gfx list 
Cc: Zhang, Hawking ; Xu, Feifei 
Subject: [PATCH] drm/amdgpu: skip vram reserve on firmware_v2_2 for bare-metal

[AMD Official Use Only - General]

vram_usagebyfirmware v2_2 is only used in the SR-IOV case, so skip the related 
settings in the bare-metal case for now.

Signed-off-by: Likun Gao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
index 9b97fa39d47a..d824ebe1d4d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
@@ -147,14 +147,16 @@ static int amdgpu_atomfirmware_allocate_fb_v2_2(struct 
amdgpu_device *adev,
  drv_start_addr,
  drv_size);
 
-   if ((fw_start_addr & (ATOM_VRAM_BLOCK_NEEDS_NO_RESERVATION << 30)) == 0) {
+   if (amdgpu_sriov_vf(adev) &&
+   (fw_start_addr & (ATOM_VRAM_BLOCK_NEEDS_NO_RESERVATION << 30)) == 0) {
/* Firmware request VRAM reservation for SR-IOV */
adev->mman.fw_vram_usage_start_offset = (fw_start_addr &
(~ATOM_VRAM_OPERATION_FLAGS_MASK)) << 10;
adev->mman.fw_vram_usage_size = fw_size << 10;
}
 
-   if ((drv_start_addr & (ATOM_VRAM_BLOCK_NEEDS_NO_RESERVATION << 30)) == 0) {
+   if (amdgpu_sriov_vf(adev) &&
+   (drv_start_addr & (ATOM_VRAM_BLOCK_NEEDS_NO_RESERVATION << 30)) == 0) {
/* driver request VRAM reservation for SR-IOV */
adev->mman.drv_vram_usage_start_offset = (drv_start_addr &
(~ATOM_VRAM_OPERATION_FLAGS_MASK)) << 10;
--
2.25.1
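
The patched condition combines a virtualization check with a flag test on the start address. The following is a sketch only: the mask and flag values reflect my reading of atomfirmware.h and should be verified before reuse, the helper names are hypothetical, and the KiB interpretation is inferred from the `<< 10` in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, mirroring my reading of atomfirmware.h; verify before reuse. */
#define SKETCH_FLAGS_MASK           0xC0000000u
#define SKETCH_NEEDS_NO_RESERVATION 0x1u

/* True when the patched condition would take the reservation branch. */
static bool sketch_should_reserve(bool is_sriov_vf, uint32_t start_addr)
{
	return is_sriov_vf &&
	       (start_addr & (SKETCH_NEEDS_NO_RESERVATION << 30)) == 0;
}

/* Offset in bytes: strip the flag bits, then scale from 1 KiB units. */
static uint64_t sketch_reserved_offset(uint32_t start_addr)
{
	return (uint64_t)(start_addr & ~SKETCH_FLAGS_MASK) << 10;
}
```

On bare metal the first helper is always false in this model, so neither the firmware nor the driver reservation is recorded, which matches the intent stated in the commit message.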

RE: [PATCH] drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0

2023-01-19 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 



-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Friday, January 20, 2023 11:28 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 

Subject: [PATCH] drm/amd/pm: add missing AllowIHInterrupt message mapping for 
SMU13.0.0

Add SMU13.0.0 AllowIHInterrupt message mapping.

Signed-off-by: Evan Quan 
Change-Id: Ief5f12215572a8029970e79814495e67d20f2388
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
index 3fded9d2c20a..5ab303760714 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
@@ -145,6 +145,7 @@ static struct cmn2asic_msg_mapping 
smu_v13_0_0_message_map[SMU_MSG_MAX_COUNT] =
MSG_MAP(SetBadMemoryPagesRetiredFlagsPerChannel,
PPSMC_MSG_SetBadMemoryPagesRetiredFlagsPerChannel,  
 0),
MSG_MAP(AllowGpo,   PPSMC_MSG_SetGpoAllow,  
 0),
+   MSG_MAP(AllowIHHostInterrupt,   PPSMC_MSG_AllowIHHostInterrupt, 
  0),
 };
 
 static struct cmn2asic_mapping smu_v13_0_0_clk_map[SMU_CLK_COUNT] = {
-- 
2.34.1

RE: [PATCH] drm/amdgpu: enable HDP SD for gfx 11.0.3

2023-01-28 Thread Xu, Feifei

[AMD Official Use Only - General]



Reviewed-by: Feifei Xu 



-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Saturday, January 28, 2023 4:06 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Quan, Evan 

Subject: [PATCH] drm/amdgpu: enable HDP SD for gfx 11.0.3

Enable HDP clock gating control for gfx 11.0.3.

Signed-off-by: Evan Quan 
Change-Id: I0bac85a05692937917e2916e79e6e74a1e11aec0
---
 drivers/gpu/drm/amd/amdgpu/soc21.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index e03cf7f766c5..477be4b62bc3 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -676,7 +676,8 @@ static int soc21_common_early_init(void *handle)
AMD_CG_SUPPORT_GFX_CGCG |
AMD_CG_SUPPORT_GFX_CGLS |
AMD_CG_SUPPORT_REPEATER_FGCG |
-   AMD_CG_SUPPORT_GFX_MGCG;
+   AMD_CG_SUPPORT_GFX_MGCG |
+   AMD_CG_SUPPORT_HDP_SD;
adev->pg_flags = AMD_PG_SUPPORT_VCN |
AMD_PG_SUPPORT_VCN_DPG |
AMD_PG_SUPPORT_JPEG;
-- 
2.34.1

RE: [PATCH] drm/amd/pm: add unique_id for gc 11.0.3

2023-08-10 Thread Xu, Feifei

[AMD Official Use Only - General]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Kenneth Feng
Sent: Friday, August 11, 2023 12:28 PM
To: amd-gfx@lists.freedesktop.org
Cc: Feng, Kenneth 
Subject: [PATCH] drm/amd/pm: add unique_id for gc 11.0.3

drm/amd/pm: add unique_id for gc 11.0.3

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 5aed023f7402..c69701da94ea 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -2076,6 +2076,7 @@ static int default_attr_update(struct amdgpu_device 
*adev, struct amdgpu_device_
case IP_VERSION(11, 0, 0):
case IP_VERSION(11, 0, 1):
case IP_VERSION(11, 0, 2):
+   case IP_VERSION(11, 0, 3):
*states = ATTR_STATE_SUPPORTED;
break;
default:
--
2.34.1

Re: [PATCH 2/2] drm/amdgpu/psp11: fix typo in comment

2019-10-18 Thread Xu, Feifei

Series is reviewed by Feifei Xu 


> On Oct 18, 2019, at 18:59, Yuan, Xiaojie  wrote:
> 
> Signed-off-by: Xiaojie Yuan 
> ---
> drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c 
> b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> index dfe85a1d79a5..4eb5bacb55f7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
> @@ -232,7 +232,7 @@ static int psp_v11_0_bootloader_load_kdb(struct 
> psp_context *psp)
>/* Copy PSP KDB binary to memory */
>memcpy(psp->fw_pri_buf, psp->kdb_start_addr, psp->kdb_bin_size);
> 
> -/* Provide the sys driver to bootloader */
> +/* Provide the PSP KDB to bootloader */
>WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_36,
>   (uint32_t)(psp->fw_pri_mc_addr >> 20));
>psp_gfxdrv_command_reg = PSP_BL__LOAD_KEY_DATABASE;
> -- 
> 2.20.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 1/3] drm/amd/powerplay: add lock protection for swSMU APIs V2

2019-10-19 Thread Xu, Feifei

Acked-by: Feifei Xu 

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Quan, Evan
Sent: Friday, October 18, 2019 10:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Grodzovsky, Andrey 
; Quan, Evan 
Subject: [PATCH 1/3] drm/amd/powerplay: add lock protection for swSMU APIs V2

This is a quick, low-risk fix. APIs that are exposed
to other IPs, or that support the sysfs/hwmon interfaces
or DAL, get lock protection. No lock protection is
enforced for APIs used only inside swSMU.
Future optimization is needed.

V2: strip the lock protection for all swSMU internal APIs

Change-Id: I8392652c9da1574a85acd9b171f04380f3630852
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c   |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h   |   6 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c|  23 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c  |   4 +-
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c| 696 --
 drivers/gpu/drm/amd/powerplay/arcturus_ppt.c  |   3 -
 .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h| 163 ++--
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c|  15 +-
 drivers/gpu/drm/amd/powerplay/renoir_ppt.c|  14 +-
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c |  22 +-
 drivers/gpu/drm/amd/powerplay/smu_v12_0.c |   3 -
 drivers/gpu/drm/amd/powerplay/vega20_ppt.c|  20 +-
 12 files changed, 777 insertions(+), 198 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
index 263265245e19..28d32725285b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
@@ -912,7 +912,8 @@ int amdgpu_dpm_get_sclk(struct amdgpu_device *adev, bool 
low)
if (is_support_sw_smu(adev)) {
ret = smu_get_dpm_freq_range(&adev->smu, SMU_GFXCLK,
 low ? &clk_freq : NULL,
-!low ? &clk_freq : NULL);
+!low ? &clk_freq : NULL,
+true);
if (ret)
return 0;
return clk_freq * 100;
@@ -930,7 +931,8 @@ int amdgpu_dpm_get_mclk(struct amdgpu_device *adev, bool 
low)
if (is_support_sw_smu(adev)) {
ret = smu_get_dpm_freq_range(&adev->smu, SMU_UCLK,
 low ? &clk_freq : NULL,
-!low ? &clk_freq : NULL);
+!low ? &clk_freq : NULL,
+true);
if (ret)
return 0;
return clk_freq * 100;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
index 1c5c0fd76dbf..2cfb677272af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
@@ -298,12 +298,6 @@ enum amdgpu_pcie_gen {
 #define amdgpu_dpm_get_current_power_state(adev) \

((adev)->powerplay.pp_funcs->get_current_power_state((adev)->powerplay.pp_handle))
 
-#define amdgpu_smu_get_current_power_state(adev) \
-   ((adev)->smu.ppt_funcs->get_current_power_state(&((adev)->smu)))
-
-#define amdgpu_smu_set_power_state(adev) \
-   ((adev)->smu.ppt_funcs->set_power_state(&((adev)->smu)))
-
 #define amdgpu_dpm_get_pp_num_states(adev, data) \

((adev)->powerplay.pp_funcs->get_pp_num_states((adev)->powerplay.pp_handle, 
data))
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index c50d5f1e75e5..36f36b35000d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -211,7 +211,7 @@ static ssize_t amdgpu_get_dpm_state(struct device *dev,
 
if (is_support_sw_smu(adev)) {
if (adev->smu.ppt_funcs->get_current_power_state)
-   pm = amdgpu_smu_get_current_power_state(adev);
+   pm = smu_get_current_power_state(&adev->smu);
else
pm = adev->pm.dpm.user_state;
} else if (adev->powerplay.pp_funcs->get_current_power_state) {
@@ -957,7 +957,7 @@ static ssize_t amdgpu_set_pp_dpm_sclk(struct device *dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_SCLK, mask);
+   ret = smu_force_clk_levels(&adev->smu, SMU_SCLK, mask, true);
else if (adev->powerplay.pp_funcs->force_clock_level)
ret = amdgpu_dpm_force_clock_level(adev, PP_SCLK, mask);
 
@@ -1004,7 +1004,7 @@ static ssize_t amdgpu_set_pp_dpm_mclk(struct device *dev,
return ret;
 
if (is_support_sw_smu(adev))
-   ret = smu_force_clk_levels(&adev->smu, SMU_MCLK, mask);
+   ret = smu_force_clk_lev

RE: [PATCH 2/3] drm/amdgpu/gfx10: update gfx golden settings for navi14

2019-10-24 Thread Xu, Feifei

Series is reviewed-by: Feifei Xu 

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Tianci Yin
Sent: Thursday, October 24, 2019 6:10 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Xiao, Jack ; Yuan, 
Xiaojie ; Yin, Tianci (Rico) ; Zhang, 
Hawking 
Subject: [PATCH 2/3] drm/amdgpu/gfx10: update gfx golden settings for navi14

From: "Tianci.Yin" 

update registers: mmCGTT_SPI_CLK_CTRL

Change-Id: Ib2539aae1fb0d001278b7f89c90ad6296f9fb85f
Signed-off-by: Tianci.Yin 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 11e863c4c40b..22d0fade9c71 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -140,7 +140,7 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_1[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCB_HW_CONTROL_4, 0x, 
0x003c0014),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_GS_NGG_CLK_CTRL, 0x8fff, 
0x8100),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_IA_CLK_CTRL, 0x0fff, 
0x0100),
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SPI_CLK_CTRL, 0xc000, 
0xc100),
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SPI_CLK_CTRL, 0xcd00, 
0x0d000100),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SQ_CLK_CTRL, 0xf8ff0fff, 
0x6100),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SQG_CLK_CTRL, 0x4ff0, 
0x4100),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_VGT_CLK_CTRL, 0x8fff, 
0x8100),
-- 
2.17.1

RE: [PATCH 1/2] drm/amdgpu: fix possible pstate switch race condition

2019-11-05 Thread Xu, Feifei



Series is Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Quan, Evan
Sent: November 5, 2019 18:24
To: amd-gfx@lists.freedesktop.org
Cc: Strawbridge, Michael ; Kim, Jonathan 
; Quan, Evan 
Subject: [PATCH 1/2] drm/amdgpu: fix possible pstate switch race condition

Add lock protection so that p-state switches are serialized. Also update the 
hive pstate only when all devices in the hive are in the same state.

Change-Id: I165a6f44e8aec1e6da56eefa0fc49d36670e56fe
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 34 ++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 0469cc51a6fb..41cf2abd6209 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1041,6 +1041,9 @@ struct amdgpu_device {
 
uint64_tunique_id;
uint64_tdf_perfmon_config_assign_mask[AMDGPU_MAX_DF_PERFMONS];
+
+   /* device pstate */
+   int pstate;
 };
 
 static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device 
*bdev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 167d9fbd2c4f..de20a9a1c444 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -274,12 +274,18 @@ int amdgpu_xgmi_set_pstate(struct amdgpu_device *adev, 
int pstate)
 {
int ret = 0;
struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev, 0);
+   struct amdgpu_device *tmp_adev;
+   bool update_hive_pstate = true;
 
if (!hive)
return 0;
 
-   if (hive->pstate == pstate)
+   mutex_lock(&hive->hive_lock);
+
+   if (hive->pstate == pstate) {
+   mutex_unlock(&hive->hive_lock);
return 0;
+   }
 
dev_dbg(adev->dev, "Set xgmi pstate %d.\n", pstate);
 
@@ -290,11 +296,32 @@ int amdgpu_xgmi_set_pstate(struct amdgpu_device *adev, 
int pstate)
ret = 
adev->powerplay.pp_funcs->set_xgmi_pstate(adev->powerplay.pp_handle,
pstate);
 
-   if (ret)
+   if (ret) {
dev_err(adev->dev,
"XGMI: Set pstate failure on device %llx, hive %llx, 
ret %d",
adev->gmc.xgmi.node_id,
adev->gmc.xgmi.hive_id, ret);
+   goto out;
+   }
+
+   /* Update device pstate */
+   adev->pstate = pstate;
+
+   /*
+* Update the hive pstate only all devices of the hive
+* are in the same pstate
+*/
+   list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
+   if (tmp_adev->pstate != adev->pstate) {
+   update_hive_pstate = false;
+   break;
+   }
+   }
+   if (update_hive_pstate)
+   hive->pstate = pstate;
+
+out:
+   mutex_unlock(&hive->hive_lock);
 
return ret;
 }
@@ -369,6 +396,9 @@ int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
goto exit;
}
 
+   /* Set default device pstate */
+   adev->pstate = -1;
+
top_info = &adev->psp.xgmi_context.top_info;
 
list_add_tail(&adev->gmc.xgmi.head, &hive->device_list);
--
2.23.0
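
Note: the locking scheme described in this patch can be sketched in user space as follows. This is a minimal illustration, not the driver code: `struct device`, `struct hive`, `hive_set_pstate`, `apply_ok`, and `PSTATE_UNKNOWN` are hypothetical stand-ins, and a pthread mutex takes the place of the kernel mutex. It shows the two ideas from the commit message: every switch runs under the hive lock, and the hive-level pstate is updated only once every device reports the same pstate.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define PSTATE_UNKNOWN (-1)

/* Hypothetical stand-ins for the driver's device and hive structures. */
struct device {
    int pstate;              /* last pstate applied to this device */
};

struct hive {
    pthread_mutex_t lock;    /* serializes pstate switches */
    int pstate;              /* pstate shared by all devices, or unknown */
    struct device **devices;
    size_t num_devices;
};

/* Hypothetical hardware hook: returns 0 on success. */
typedef int (*apply_pstate_fn)(struct device *dev, int pstate);

/* Trivial hook used for demonstration: always succeeds. */
static int apply_ok(struct device *dev, int pstate)
{
    (void)dev;
    (void)pstate;
    return 0;
}

int hive_set_pstate(struct hive *hive, struct device *dev, int pstate,
                    apply_pstate_fn apply)
{
    bool update_hive = true;
    int ret = 0;
    size_t i;

    pthread_mutex_lock(&hive->lock);

    /* Nothing to do when the whole hive is already in the target state. */
    if (hive->pstate == pstate)
        goto out;

    ret = apply(dev, pstate);
    if (ret)
        goto out;

    dev->pstate = pstate;

    /* Promote to the hive level only when every device agrees. */
    for (i = 0; i < hive->num_devices; i++) {
        if (hive->devices[i]->pstate != pstate) {
            update_hive = false;
            break;
        }
    }
    if (update_hive)
        hive->pstate = pstate;

out:
    /* Single unlock point, reached from every path above. */
    pthread_mutex_unlock(&hive->lock);
    return ret;
}
```

With two devices in a hive, the first call for one device leaves the hive pstate unknown, and the hive pstate changes only after the second device has been switched to the same value.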

RE: [PATCH] drm/amd/powerplay: fix deadlock on setting power_dpm_force_performance_level

2019-11-05 Thread Xu, Feifei



Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Quan, Evan
Sent: Wednesday, November 6, 2019 2:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH] drm/amd/powerplay: fix deadlock on setting 
power_dpm_force_performance_level

smu_enable_umd_pstate() tries to take smu->mutex, which is already held 
by its parent API smu_force_performance_level() on the call path, so a 
deadlock occurs.

Change-Id: Ic4d3c7d06eb47eab2ea42b98f399cd95ab320f0c
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index facc19cae7e5..c21fe7ac5df8 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -383,14 +383,25 @@ bool smu_clk_dpm_is_enabled(struct smu_context *smu, enum 
smu_clk_type clk_type)
return true;
 }
 
-
+/**
+ * smu_dpm_set_power_gate - power gate/ungate the specific IP block
+ *
+ * @smu:smu_context pointer
+ * @block_type: the IP block to power gate/ungate
+ * @gate:   to power gate if true, ungate otherwise
+ *
+ * This API uses no smu->mutex lock protection due to:
+ * 1. It is either called by other IP block(gfx/sdma/vcn/uvd/vce).
+ *This is guarded to be race condition free by the caller.
+ * 2. Or get called on user setting request of 
power_dpm_force_performance_level.
+ *Under this case, the smu->mutex lock protection is already enforced on
+ *the parent API smu_force_performance_level of the call path.
+ */
 int smu_dpm_set_power_gate(struct smu_context *smu, uint32_t block_type,
   bool gate)
 {
int ret = 0;
 
-   mutex_lock(&smu->mutex);
-
switch (block_type) {
case AMD_IP_BLOCK_TYPE_UVD:
ret = smu_dpm_set_uvd_enable(smu, gate);
@@ -408,8 +419,6 @@ int smu_dpm_set_power_gate(struct smu_context *smu, 
uint32_t block_type,
break;
}
 
-   mutex_unlock(&smu->mutex);
-
return ret;
 }
 
--
2.23.0
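
Note: the deadlock and its fix can be illustrated with a small user-space sketch. The names here (`struct smu_ctx`, `set_power_gate_nolock`, `force_performance_level`) are hypothetical, a pthread mutex stands in for the kernel mutex, and the rule "gate when level is 0" is made up purely to give the helper something to do. The point is the structure: the public entry point is the only function on the call path that takes the lock, and the helper it calls takes none.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the SMU context. */
struct smu_ctx {
    pthread_mutex_t mutex;   /* non-recursive: relocking it would hang */
    bool uvd_gated;
    int level;
};

/*
 * Internal helper: takes no lock. Callers either hold ctx->mutex already
 * or are serialized in some other way.
 */
static int set_power_gate_nolock(struct smu_ctx *ctx, bool gate)
{
    ctx->uvd_gated = gate;
    return 0;
}

/* Public entry point: the only place on this call path that locks. */
int force_performance_level(struct smu_ctx *ctx, int level)
{
    int ret;

    pthread_mutex_lock(&ctx->mutex);
    ctx->level = level;
    /* Safe: the helper does not try to take ctx->mutex again. */
    ret = set_power_gate_nolock(ctx, level == 0);
    pthread_mutex_unlock(&ctx->mutex);

    return ret;
}
```

Had the helper also called `pthread_mutex_lock(&ctx->mutex)`, the second acquisition by the same thread would never return on a default (non-recursive) mutex, which is the failure the patch removes.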

RE: [PATCH 1/2] drm/amdgpu/gfx10: update gfx golden settings

2019-12-10 Thread Xu, Feifei




Series is Reviewed-by: Feifei Xu 

-Original Message-
From: Tianci Yin  
Sent: Wednesday, December 11, 2019 11:22 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Yuan, Xiaojie ; Long, Gang ; Li, 
Pauline ; Yin, Tianci (Rico) 
Subject: [PATCH 1/2] drm/amdgpu/gfx10: update gfx golden settings

From: "Tianci.Yin" 

add registers: mmSPI_CONFIG_CNTL

Signed-off-by: Tianci.Yin 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index ed630d37c32c..f3324fa4e194 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -114,6 +114,7 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_ENHANCE_1, 0x0040, 
0x0444),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_LINE_STIPPLE_STATE, 0xff0f, 
0x),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmRMI_SPARE, 0x, 0x3101),
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmSPI_CONFIG_CNTL, 0x001f, 
0x00070104),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_ALU_CLK_CTRL, 0x, 
0x),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_ARB_CONFIG, 0x0100, 0x0130),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_LDS_CLK_CTRL, 0x, 
0x),
-- 
2.17.1

RE: [PATCH] drm/amdgpu/gfx10: update gfx golden settings for navi12

2019-12-10 Thread Xu, Feifei




Reviewed-by: Feifei Xu 

-Original Message-
From: Tianci Yin  
Sent: Wednesday, December 11, 2019 2:09 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Yuan, Xiaojie ; Long, Gang ; Li, 
Pauline ; Yin, Tianci (Rico) 
Subject: [PATCH] drm/amdgpu/gfx10: update gfx golden settings for navi12

From: "Tianci.Yin" 

add registers: mmSPI_CONFIG_CNTL
update registers: mmDB_DEBUG4 and mmUTCL1_CTRL

Signed-off-by: Tianci.Yin 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index db9b8bfb1c3c..557ebf317b5e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -185,7 +185,7 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_2[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_DEBUG, 0x, 0x2000),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_DEBUG2, 0x, 0x0420),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_DEBUG3, 0x, 0x0200),
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_DEBUG4, 0x, 0x0480),
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_DEBUG4, 0x, 0x0490),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_DFSM_TILES_IN_FLIGHT, 0x, 
0x003f),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmDB_LAST_OF_BURST_CONFIG, 0x, 
0x03860204),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmGCR_GENERAL_CNTL, 0x1ff0, 
0x0500),
@@ -205,12 +205,13 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_2[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_ENHANCE_2, 0x0820, 
0x0820),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_LINE_STIPPLE_STATE, 0xff0f, 
0x),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmRMI_SPARE, 0x, 0x3101),
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmSPI_CONFIG_CNTL, 0x001f, 
+0x00070104),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_ALU_CLK_CTRL, 0x, 
0x),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_ARB_CONFIG, 0x0133, 0x0130),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_LDS_CLK_CTRL, 0x, 
0x),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmTA_CNTL_AUX, 0xfff7, 0x0103),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmTCP_CNTL, 0xffdf80ff, 0x479c0010),
-   SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0x, 0x0080)
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0x, 0x00c0)
 };
 
 static const struct soc15_reg_golden golden_settings_gc_10_1_nv14[] =
--
2.17.1

RE: [PATCH 2/2] drm/amdgpu/gfx10: update gfx golden settings for navi14

2019-12-11 Thread Xu, Feifei




Series is Reviewed-by: Feifei Xu 

-Original Message-
From: Tianci Yin  
Sent: Wednesday, December 11, 2019 8:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Yuan, Xiaojie ; Long, Gang ; Li, 
Pauline ; Yin, Tianci (Rico) 
Subject: [PATCH 2/2] drm/amdgpu/gfx10: update gfx golden settings for navi14

From: "Tianci.Yin" 

add registers: mmPA_SC_BINNER_TIMEOUT_COUNTER and mmPA_SC_ENHANCE_2

Change-Id: I1fc3fb481b2d9edc482a32497242a8be6cd6b8d7
Signed-off-by: Tianci.Yin 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index e5637a6efb05..8cdef79de9d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -162,8 +162,10 @@ static const struct soc15_reg_golden 
golden_settings_gc_10_1_1[] =
SOC15_REG_GOLDEN_VALUE(GC, 0, mmGL2C_CGTT_SCLK_CTRL, 0x0fff, 
0x1100),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmGL2C_CTRL2, 0x, 0x1402002f),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmGL2C_CTRL3, 0xbfff, 0x0188),
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_BINNER_TIMEOUT_COUNTER, 
0x, 0x0800),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_ENHANCE, 0x3fff, 0x0809),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_ENHANCE_1, 0x0040, 
0x0444),
+   SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_ENHANCE_2, 0x0800, 
0x0820),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_LINE_STIPPLE_STATE, 0xff0f, 
0x),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmRMI_SPARE, 0x, 0x3101),
SOC15_REG_GOLDEN_VALUE(GC, 0, mmSPI_CONFIG_CNTL, 0x001f, 
0x00070105),
-- 
2.17.1

RE: [PATCH] drm/amdgpu: Fix vce work queue was not cancelled when suspend

2018-09-27 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

Thanks

Regards,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Rex Zhu
Sent: 2018年9月27日 20:49
To: amd-gfx@lists.freedesktop.org
Cc: Zhu, Rex 
Subject: [PATCH] drm/amdgpu: Fix vce work queue was not cancelled when suspend

The VCE cancel_delayed_work_sync() was never called on the normal path;
the driver only called it on the error path.

This caused an A+A suspend hang when runtime PM is enabled.
The idle work accesses the SMU, which hangs the SMU because the dGPU has 
already been suspended, and the dGPU is also woken up. Because the SMU has 
hung, the dGPU resume then fails.

Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 0cc5190..5f3f540 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -258,6 +258,8 @@ int amdgpu_vce_suspend(struct amdgpu_device *adev)
 {
int i;
 
+   cancel_delayed_work_sync(&adev->vce.idle_work);
+
if (adev->vce.vcpu_bo == NULL)
return 0;
 
@@ -268,7 +270,6 @@ int amdgpu_vce_suspend(struct amdgpu_device *adev)
if (i == AMDGPU_MAX_VCE_HANDLES)
return 0;
 
-   cancel_delayed_work_sync(&adev->vce.idle_work);
/* TODO: suspending running encoding sessions isn't supported */
return -EINVAL;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index a73674f..fb7df63 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -162,11 +162,11 @@ int amdgpu_vcn_suspend(struct amdgpu_device *adev)
unsigned size;
void *ptr;
 
+   cancel_delayed_work_sync(&adev->vcn.idle_work);
+
if (adev->vcn.vcpu_bo == NULL)
return 0;
 
-   cancel_delayed_work_sync(&adev->vcn.idle_work);
-
size = amdgpu_bo_size(adev->vcn.vcpu_bo);
ptr = adev->vcn.cpu_addr;
 
--
1.9.1
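
Note: the ordering problem fixed here can be shown with a small sketch. Everything in it is a hypothetical stand-in (`struct engine`, `cancel_idle_work`, `engine_suspend`); `cancel_idle_work` only models the effect of `cancel_delayed_work_sync()`, namely that no idle work is pending once it returns. The sketch assumes the same control flow as the diff: two early `return 0` paths followed by an error return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block with a delayed idle worker. */
struct engine {
    bool idle_work_pending;  /* a delayed work item is queued */
    void *firmware_bo;       /* NULL when no firmware buffer was allocated */
    int open_handles;        /* active sessions */
};

/* Models cancel_delayed_work_sync(): nothing is pending after it returns. */
static void cancel_idle_work(struct engine *e)
{
    e->idle_work_pending = false;
}

int engine_suspend(struct engine *e)
{
    /* Cancel first so that every return path below leaves no work queued. */
    cancel_idle_work(e);

    if (e->firmware_bo == NULL)
        return 0;

    if (e->open_handles == 0)
        return 0;

    /* Suspending with active sessions is not supported in this sketch. */
    return -EINVAL;
}
```

If the cancel call sat just before the final `return -EINVAL`, both successful paths would return with the idle work still queued, which matches the situation the commit message describes.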

RE: [PATCH 3/3] drm/amd/powerplay: update PPtable with DC BTC and Tvr SocLimit fields

2018-10-15 Thread Xu, Feifei

Series reviewed by: Feifei Xu 

Regards,
Feifei

-Original Message-
From: Evan Quan  
Sent: October 16, 2018 10:52
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Deucher, Alexander 
; Quan, Evan 
Subject: [PATCH 3/3] drm/amd/powerplay: update PPtable with DC BTC and Tvr 
SocLimit fields

Update the PPtable structure to fit the latest SMC firmware.

Change-Id: I97db5955085efa1ecf44ae23d26fdcc70ec2fc9a
Signed-off-by: Evan Quan 
---
 .../amd/powerplay/hwmgr/vega20_processpptables.c| 10 ++
 drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h | 13 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c
index e71740479bb8..e5f7f8230065 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c
@@ -100,9 +100,8 @@ static void dump_pptable(PPTable_t *pptable)
pr_info("PpmTemperatureThreshold = %d\n", 
pptable->PpmTemperatureThreshold);
 
pr_info("MemoryOnPackage = 0x%02x\n", pptable->MemoryOnPackage);
-   pr_info("padding8_limits[0] = 0x%02x\n", pptable->padding8_limits[0]);
-   pr_info("padding8_limits[1] = 0x%02x\n", pptable->padding8_limits[1]);
-   pr_info("padding8_limits[2] = 0x%02x\n", pptable->padding8_limits[2]);
+   pr_info("padding8_limits = 0x%02x\n", pptable->padding8_limits);
+   pr_info("Tvr_SocLimit = %d\n", pptable->Tvr_SocLimit);
 
pr_info("UlvVoltageOffsetSoc = %d\n", pptable->UlvVoltageOffsetSoc);
pr_info("UlvVoltageOffsetGfx = %d\n", pptable->UlvVoltageOffsetGfx);
@@ -539,7 +538,10 @@ static void dump_pptable(PPTable_t *pptable)
pr_info("FanGainVrMem0 = %d\n", pptable->FanGainVrMem0);
pr_info("FanGainVrMem0 = %d\n", pptable->FanGainVrMem0);
 
-   for (i = 0; i < 12; i++)
+   pr_info("DcBtcGb[AVFS_VOLTAGE_GFX] = 0x%x\n", 
pptable->DcBtcGb[AVFS_VOLTAGE_GFX]);
+   pr_info("DcBtcGb[AVFS_VOLTAGE_SOC] = 0x%x\n", 
+pptable->DcBtcGb[AVFS_VOLTAGE_SOC]);
+
+   for (i = 0; i < 11; i++)
pr_info("Reserved[%d] = 0x%x\n", i, pptable->Reserved[i]);
 
for (i = 0; i < 3; i++)
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h 
b/drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h
index c72cfab83df9..2998a49960ed 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h
@@ -165,7 +165,7 @@
 #define FEATURE_DS_FCLK_MASK(1 << FEATURE_DS_FCLK_BIT)
 #define FEATURE_DS_MP1CLK_MASK  (1 << FEATURE_DS_MP1CLK_BIT  )
 #define FEATURE_DS_MP0CLK_MASK  (1 << FEATURE_DS_MP0CLK_BIT  )
-
+#define FEATURE_XGMI_MASK   (1 << FEATURE_XGMI_BIT   )
 
 #define DPM_OVERRIDE_DISABLE_SOCCLK_PID 0x0001
 #define DPM_OVERRIDE_DISABLE_UCLK_PID   0x0002
@@ -391,8 +391,8 @@ typedef struct {
   uint16_t PpmTemperatureThreshold;
 
   uint8_t  MemoryOnPackage;
-  uint8_t  padding8_limits[3];
-
+  uint8_t  padding8_limits;
+  uint16_t Tvr_SocLimit;
 
   uint16_t  UlvVoltageOffsetSoc;
   uint16_t  UlvVoltageOffsetGfx;
@@ -501,7 +501,7 @@ typedef struct {
   uint8_t   DcBtcEnabled[AVFS_VOLTAGE_COUNT];
   uint8_t   Padding8_GfxBtc[2];
 
-  uint16_t  DcBtcMin[AVFS_VOLTAGE_COUNT];
+  int16_t   DcBtcMin[AVFS_VOLTAGE_COUNT];
   uint16_t  DcBtcMax[AVFS_VOLTAGE_COUNT];
 
 
@@ -526,7 +526,10 @@ typedef struct {
 
   uint16_t FanGainVrMem0;
   uint16_t FanGainVrMem1;
-  uint32_t Reserved[12];
+
+  uint16_t DcBtcGb[AVFS_VOLTAGE_COUNT];
+
+  uint32_t Reserved[11];
 
   uint32_t Padding32[3];
 
--
2.19.1
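
Note: both structure changes in this patch carve new fields out of existing padding or reserved space, so the overall table size should stay the same. The sketch below checks that arithmetic on excerpts of the structure. It is only an illustration under stated assumptions: the `limits_*` and `tail_*` struct names are hypothetical, only the fields visible in the diff are reproduced, `AVFS_VOLTAGE_COUNT` is assumed to be 2 (the diff indexes `DcBtcGb` with a GFX and a SOC entry), and natural alignment on a mainstream ABI is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AVFS_VOLTAGE_COUNT 2  /* assumed: one entry each for GFX and SOC */

/* Excerpts only; the real PPTable_t has many more fields. */
struct limits_old {
    uint8_t  MemoryOnPackage;
    uint8_t  padding8_limits[3];
};

struct limits_new {
    uint8_t  MemoryOnPackage;
    uint8_t  padding8_limits;
    uint16_t Tvr_SocLimit;   /* takes over two of the three padding bytes */
};

struct tail_old {
    uint16_t FanGainVrMem0;
    uint16_t FanGainVrMem1;
    uint32_t Reserved[12];
};

struct tail_new {
    uint16_t FanGainVrMem0;
    uint16_t FanGainVrMem1;
    uint16_t DcBtcGb[AVFS_VOLTAGE_COUNT]; /* 2 x 16 bit = one reserved dword */
    uint32_t Reserved[11];
};

/* The new fields must not change the size of either excerpt. */
_Static_assert(sizeof(struct limits_old) == sizeof(struct limits_new),
               "limits excerpt changed size");
_Static_assert(sizeof(struct tail_old) == sizeof(struct tail_new),
               "tail excerpt changed size");
```

Under these assumptions the two `uint16_t` entries of `DcBtcGb` occupy exactly the four bytes freed by shrinking `Reserved` from 12 to 11 entries, which is also why the dump loop in the patch now stops at 11.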

RE: [PATCH] drm/amd/powerplay: bump the PPtable version supported

2018-10-19 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Friday, October 19, 2018 4:59 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Xu, Feifei 
; Quan, Evan 
Subject: [PATCH] drm/amd/powerplay: bump the PPtable version supported

The matching VBIOS is now ready. Also drop the temporary workarounds 
applied earlier.

Change-Id: If5b78298bc0817b06e11aba49d390fa341d714b4
Signed-off-by: Evan Quan 
---
 .../powerplay/hwmgr/vega20_processpptables.c  | 46 +++
 .../drm/amd/powerplay/inc/smu11_driver_if.h   |  2 +-
 2 files changed, 18 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c
index e5f7f8230065..f7e8bbdc20b0 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c
@@ -716,10 +716,6 @@ static int append_vbios_pptable(struct pp_hwmgr *hwmgr, 
PPTable_t *ppsmc_pptable
"[appendVbiosPPTable] Failed to retrieve Smc Dpm Table from 
VBIOS!",
return -1);
 
-   memset(ppsmc_pptable->Padding32,
-   0,
-   sizeof(struct atom_smc_dpm_info_v4_4) -
-   sizeof(struct atom_common_table_header));
ppsmc_pptable->MaxVoltageStepGfx = smc_dpm_table->maxvoltagestepgfx;
ppsmc_pptable->MaxVoltageStepSoc = smc_dpm_table->maxvoltagestepsoc;
 
@@ -778,22 +774,19 @@ static int append_vbios_pptable(struct pp_hwmgr *hwmgr, 
PPTable_t *ppsmc_pptable
ppsmc_pptable->FllGfxclkSpreadPercent = 
smc_dpm_table->fllgfxclkspreadpercent;
ppsmc_pptable->FllGfxclkSpreadFreq = smc_dpm_table->fllgfxclkspreadfreq;
 
-   if ((smc_dpm_table->table_header.format_revision == 4) &&
-   (smc_dpm_table->table_header.content_revision == 4)) {
-   for (i = 0; i < I2C_CONTROLLER_NAME_COUNT; i++) {
-   ppsmc_pptable->I2cControllers[i].Enabled =
-   smc_dpm_table->i2ccontrollers[i].enabled;
-   ppsmc_pptable->I2cControllers[i].SlaveAddress =
-   smc_dpm_table->i2ccontrollers[i].slaveaddress;
-   ppsmc_pptable->I2cControllers[i].ControllerPort =
-   smc_dpm_table->i2ccontrollers[i].controllerport;
-   ppsmc_pptable->I2cControllers[i].ThermalThrottler =
-   
smc_dpm_table->i2ccontrollers[i].thermalthrottler;
-   ppsmc_pptable->I2cControllers[i].I2cProtocol =
-   smc_dpm_table->i2ccontrollers[i].i2cprotocol;
-   ppsmc_pptable->I2cControllers[i].I2cSpeed =
-   smc_dpm_table->i2ccontrollers[i].i2cspeed;
-   }
+   for (i = 0; i < I2C_CONTROLLER_NAME_COUNT; i++) {
+   ppsmc_pptable->I2cControllers[i].Enabled =
+   smc_dpm_table->i2ccontrollers[i].enabled;
+   ppsmc_pptable->I2cControllers[i].SlaveAddress =
+   smc_dpm_table->i2ccontrollers[i].slaveaddress;
+   ppsmc_pptable->I2cControllers[i].ControllerPort =
+   smc_dpm_table->i2ccontrollers[i].controllerport;
+   ppsmc_pptable->I2cControllers[i].ThermalThrottler =
+   smc_dpm_table->i2ccontrollers[i].thermalthrottler;
+   ppsmc_pptable->I2cControllers[i].I2cProtocol =
+   smc_dpm_table->i2ccontrollers[i].i2cprotocol;
+   ppsmc_pptable->I2cControllers[i].I2cSpeed =
+   smc_dpm_table->i2ccontrollers[i].i2cspeed;
}
 
return 0;
@@ -882,15 +875,10 @@ static int init_powerplay_table_information(
if (pptable_information->smc_pptable == NULL)
return -ENOMEM;
 
-   if (powerplay_table->smcPPTable.Version <= 2)
-   memcpy(pptable_information->smc_pptable,
-   &(powerplay_table->smcPPTable),
-   sizeof(PPTable_t) -
-   sizeof(I2cControllerConfig_t) * 
I2C_CONTROLLER_NAME_COUNT);
-   else
-   memcpy(pptable_information->smc_pptable,
-   &(powerplay_table->smcPPTable),
-   sizeof(PPTable_t));
+   memcpy(pptable_information->smc_pptable,
+   &(powerplay_table->smcPPTable),
+   sizeof(PPTable_t));
+
 
result = append_vbios_pptable(hwmgr, 
(pptable_information->smc_pptable));
 
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h 
b/drivers/gpu/drm/amd/powerplay/inc/smu11_driver_if.h

RE: [PATCH] drm/amd/powerplay: commit get_performance_level API as DAL needed

2018-10-22 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Monday, October 22, 2018 3:20 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Xu, Feifei 
; Quan, Evan 
Subject: [PATCH] drm/amd/powerplay: commit get_performance_level API as DAL 
needed

This suppresses the error reported on driver loading. The APIs are empty 
because Vega12/Vega20 have no performance levels.

Change-Id: Ifa322a0e57fe3be4bfd9503f26e8deb7daab096d
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 8 ++++++++
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
index 9600e2f226e9..74bc37308dc0 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
@@ -2356,6 +2356,13 @@ static int vega12_gfx_off_control(struct pp_hwmgr 
*hwmgr, bool enable)
return vega12_disable_gfx_off(hwmgr);
 }
 
+static int vega12_get_performance_level(struct pp_hwmgr *hwmgr, const struct 
pp_hw_power_state *state,
+   PHM_PerformanceLevelDesignation designation, 
uint32_t index,
+   PHM_PerformanceLevel *level)
+{
+   return 0;
+}
+
 static const struct pp_hwmgr_func vega12_hwmgr_funcs = {
.backend_init = vega12_hwmgr_backend_init,
.backend_fini = vega12_hwmgr_backend_fini,
@@ -2406,6 +2413,7 @@ static const struct pp_hwmgr_func vega12_hwmgr_funcs = {
.register_irq_handlers = smu9_register_irq_handlers,
.start_thermal_controller = vega12_start_thermal_controller,
.powergate_gfx = vega12_gfx_off_control,
+   .get_performance_level = vega12_get_performance_level,
 };
 
 int vega12_hwmgr_init(struct pp_hwmgr *hwmgr)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index b4dbbb7c334c..894eae4b9d21 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -2041,6 +2041,13 @@ int vega20_display_clock_voltage_request(struct pp_hwmgr 
*hwmgr,
return result;
 }
 
+static int vega20_get_performance_level(struct pp_hwmgr *hwmgr, const struct pp_hw_power_state *state,
+   PHM_PerformanceLevelDesignation designation, uint32_t index,
+   PHM_PerformanceLevel *level)
+{
+   return 0;
+}
+
 static int vega20_notify_smc_display_config_after_ps_adjustment(
struct pp_hwmgr *hwmgr)
 {
@@ -3476,6 +3483,8 @@ static const struct pp_hwmgr_func vega20_hwmgr_funcs = {
vega20_set_watermarks_for_clocks_ranges,
.display_clock_voltage_request =
vega20_display_clock_voltage_request,
+   .get_performance_level =
+   vega20_get_performance_level,
/* UMD pstate, profile related */
.force_dpm_level =
vega20_dpm_force_dpm_level,
--
2.19.1
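
The effect of the patch above can be sketched in isolation. This is only a minimal illustration, not the powerplay code: `struct hw_funcs`, `query_performance_level` and the `-1` error value are hypothetical names chosen for the example, and it assumes the caller treats a missing callback as an error, which is what the commit message implies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-ASIC function table. */
struct hw_funcs {
	int (*get_performance_level)(unsigned int index, unsigned int *level);
};

/* Empty callback: the ASIC has no performance levels, so just report success. */
static int stub_get_performance_level(unsigned int index, unsigned int *level)
{
	(void)index;
	(void)level;
	return 0;
}

/* Assumed caller behaviour: a missing callback is treated as an error. */
static int query_performance_level(const struct hw_funcs *funcs,
				   unsigned int index, unsigned int *level)
{
	assert(funcs != NULL);
	if (funcs->get_performance_level == NULL)
		return -1; /* this is the case that would be reported at load time */
	return funcs->get_performance_level(index, level);
}
```

With the stub registered, the query succeeds without touching `level`; without it, the caller sees the error path.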

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving current clocks

2018-10-24 Thread Xu, Feifei

Reviewed-by: Feifei Xu

Regards,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: October 24, 2018 16:09
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH 2/2] drm/amd/powerplay: commonize the API for retrieving 
current clocks

So that it can be shared between all clocks.

Change-Id: Ibac99b2aa81c1cb3e988b4eae6c98d32b7f35bed
Signed-off-by: Evan Quan 
---
 .../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 44 +++
 1 file changed, 15 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 8a1ee9ce7386..57143d51e3ee 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -1875,38 +1875,20 @@ static int vega20_get_gpu_power(struct pp_hwmgr *hwmgr,
return ret;
 }
 
-static int vega20_get_current_gfx_clk_freq(struct pp_hwmgr *hwmgr, uint32_t 
*gfx_freq)
+static int vega20_get_current_clk_freq(struct pp_hwmgr *hwmgr,
+   PPCLK_e clk_id, uint32_t *clk_freq)
 {
-   uint32_t gfx_clk = 0;
int ret = 0;
 
-   *gfx_freq = 0;
+   *clk_freq = 0;
 
PP_ASSERT_WITH_CODE((ret = smum_send_msg_to_smc_with_parameter(hwmgr,
-   PPSMC_MSG_GetDpmClockFreq, (PPCLK_GFXCLK << 16))) == 0,
-   "[GetCurrentGfxClkFreq] Attempt to get Current GFXCLK 
Frequency Failed!",
+   PPSMC_MSG_GetDpmClockFreq, (clk_id << 16))) == 0,
+   "[GetCurrentClkFreq] Attempt to get Current Frequency 
Failed!",
return ret);
-   gfx_clk = smum_get_argument(hwmgr);
+   *clk_freq = smum_get_argument(hwmgr);
 
-   *gfx_freq = gfx_clk * 100;
-
-   return 0;
-}
-
-static int vega20_get_current_mclk_freq(struct pp_hwmgr *hwmgr, uint32_t *mclk_freq)
-{
-   uint32_t mem_clk = 0;
-   int ret = 0;
-
-   *mclk_freq = 0;
-
-   PP_ASSERT_WITH_CODE((ret = smum_send_msg_to_smc_with_parameter(hwmgr,
-   PPSMC_MSG_GetDpmClockFreq, (PPCLK_UCLK << 16))) == 0,
-   "[GetCurrentMClkFreq] Attempt to get Current MCLK 
Frequency Failed!",
-   return ret);
-   mem_clk = smum_get_argument(hwmgr);
-
-   *mclk_freq = mem_clk * 100;
+   *clk_freq = *clk_freq * 100;
 
return 0;
 }
@@ -1937,12 +1919,16 @@ static int vega20_read_sensor(struct pp_hwmgr *hwmgr, 
int idx,
 
switch (idx) {
case AMDGPU_PP_SENSOR_GFX_SCLK:
-   ret = vega20_get_current_gfx_clk_freq(hwmgr, (uint32_t *)value);
+   ret = vega20_get_current_clk_freq(hwmgr,
+   PPCLK_GFXCLK,
+   (uint32_t *)value);
if (!ret)
*size = 4;
break;
case AMDGPU_PP_SENSOR_GFX_MCLK:
-   ret = vega20_get_current_mclk_freq(hwmgr, (uint32_t *)value);
+   ret = vega20_get_current_clk_freq(hwmgr,
+   PPCLK_UCLK,
+   (uint32_t *)value);
if (!ret)
*size = 4;
break;
@@ -2743,7 +2729,7 @@ static int vega20_print_clock_levels(struct pp_hwmgr 
*hwmgr,
 
switch (type) {
case PP_SCLK:
-   ret = vega20_get_current_gfx_clk_freq(hwmgr, &now);
+   ret = vega20_get_current_clk_freq(hwmgr, PPCLK_GFXCLK, &now);
PP_ASSERT_WITH_CODE(!ret,
"Attempt to get current gfx clk Failed!",
return ret);
@@ -2760,7 +2746,7 @@ static int vega20_print_clock_levels(struct pp_hwmgr 
*hwmgr,
break;
 
case PP_MCLK:
-   ret = vega20_get_current_mclk_freq(hwmgr, &now);
+   ret = vega20_get_current_clk_freq(hwmgr, PPCLK_UCLK, &now);
PP_ASSERT_WITH_CODE(!ret,
"Attempt to get current mclk freq Failed!",
return ret);
--
2.19.1
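
The refactoring in this patch can be sketched with a mock in place of the SMC. This is a sketch under stated assumptions, not the driver code: `enum clk_id`, `mock_send_msg`, `mock_get_argument` and the reply values are hypothetical; only the `id << 16` parameter encoding and the multiplication by 100 are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock ids; the real PPCLK_e values are not shown in the patch. */
enum clk_id { CLK_GFX = 0, CLK_UCLK = 1 };

/* Mock firmware state: last parameter received and the canned replies. */
static uint32_t mock_last_param;
static const uint32_t mock_reply[2] = { 1200, 800 };

/* Mock message send: records the parameter and returns 0 for success. */
static int mock_send_msg(uint32_t param)
{
	mock_last_param = param;
	return 0;
}

/* Mock argument read: reply for the clock id encoded in the upper 16 bits. */
static uint32_t mock_get_argument(void)
{
	return mock_reply[(mock_last_param >> 16) & 1u];
}

/* One helper for every clock: the id selects which clock is queried. */
static int get_current_clk_freq(enum clk_id id, uint32_t *clk_freq)
{
	int ret;

	assert(clk_freq != NULL);
	*clk_freq = 0;

	ret = mock_send_msg((uint32_t)id << 16);
	if (ret)
		return ret;

	/* Same scaling as the patch: multiply the returned value by 100. */
	*clk_freq = mock_get_argument() * 100;
	return 0;
}
```

Callers that used the two separate helpers now differ only in the id they pass, which is what lets the sensor and print paths share one function.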


RE: [PATCH 1/2] drm/amd/powerplay: correct the clocks for DAL to be Khz unit

2018-10-24 Thread Xu, Feifei

Reviewed-by: Feifei Xu

Regards,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: October 24, 2018 16:09
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH 1/2] drm/amd/powerplay: correct the clocks for DAL to be Khz 
unit

Currently the clocks reported are in 10Khz unit. Correct them as Khz unit as 
DAL wanted.

Change-Id: I91e9f4b460efbdc0ba223901b6c40e576523686d
Signed-off-by: Evan Quan 
---
 .../drm/amd/powerplay/hwmgr/vega20_hwmgr.c| 21 +--
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 4c9a1a9ef04b..8a1ee9ce7386 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -2012,7 +2012,6 @@ int vega20_display_clock_voltage_request(struct pp_hwmgr 
*hwmgr,
if (data->smu_features[GNLD_DPM_DCEFCLK].enabled) {
switch (clk_type) {
case amd_pp_dcef_clock:
-   clk_freq = clock_req->clock_freq_in_khz / 100;
clk_select = PPCLK_DCEFCLK;
break;
case amd_pp_disp_clock:
@@ -2072,7 +2071,7 @@ static int 
vega20_notify_smc_display_config_after_ps_adjustment(
 
if (data->smu_features[GNLD_DPM_DCEFCLK].supported) {
clock_req.clock_type = amd_pp_dcef_clock;
-   clock_req.clock_freq_in_khz = min_clocks.dcefClock;
+   clock_req.clock_freq_in_khz = min_clocks.dcefClock * 10;
if (!vega20_display_clock_voltage_request(hwmgr, &clock_req)) {
if (data->smu_features[GNLD_DS_DCEFCLK].supported)
PP_ASSERT_WITH_CODE((ret = 
smum_send_msg_to_smc_with_parameter(
@@ -2371,7 +2370,7 @@ static int vega20_get_sclks(struct pp_hwmgr *hwmgr,
 
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
-   dpm_table->dpm_levels[i].value * 100;
+   dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us = 0;
}
 
@@ -2401,7 +2400,7 @@ static int vega20_get_memclocks(struct pp_hwmgr *hwmgr,
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
data->mclk_latency_table.entries[i].frequency =
-   dpm_table->dpm_levels[i].value * 100;
+   dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us =
data->mclk_latency_table.entries[i].latency =
vega20_get_mem_latency(hwmgr, dpm_table->dpm_levels[i].value);
@@ -2426,7 +2425,7 @@ static int vega20_get_dcefclocks(struct pp_hwmgr *hwmgr,
 
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
-   dpm_table->dpm_levels[i].value * 100;
+   dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us = 0;
}
 
@@ -2449,7 +2448,7 @@ static int vega20_get_socclocks(struct pp_hwmgr *hwmgr,
 
for (i = 0; i < count; i++) {
clocks->data[i].clocks_in_khz =
-   dpm_table->dpm_levels[i].value * 100;
+   dpm_table->dpm_levels[i].value * 1000;
clocks->data[i].latency_in_us = 0;
}
 
@@ -2600,11 +2599,11 @@ static int vega20_odn_edit_dpm_table(struct pp_hwmgr 
*hwmgr,
return -EINVAL;
}
 
-   if (input_clk < clocks.data[0].clocks_in_khz / 100 ||
+   if (input_clk < clocks.data[0].clocks_in_khz / 1000 ||
input_clk > 
od8_settings[OD8_SETTING_UCLK_FMAX].max_value) {
pr_info("clock freq %d is not within allowed 
range [%d - %d]\n",
input_clk,
-   clocks.data[0].clocks_in_khz / 100,
+   clocks.data[0].clocks_in_khz / 1000,

od8_settings[OD8_SETTING_UCLK_FMAX].max_value);
return -EINVAL;
}
@@ -2756,7 +2755,7 @@ static int vega20_print_clock_levels(struct pp_hwmgr 
*hwmgr,
 
for (i = 0; i < clocks.num_levels; i++)
size += sprintf(buf + size, "%d: %uMhz %s\n",
-   i, clocks.data[i].clocks_in_khz / 100,
+   i, clocks.data[i].clocks_in_khz / 1000,
(clocks.data[i].clocks_in_khz == now) ? "*" : 
"");
break;
 
@@ -2773,7 +2772,7 @@ static int vega20_print_clock_levels(struct pp_hwmgr 
*hwmgr,
 
for (i = 0; i < clocks.num_levels; i++)
size += sprin
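
The scale factors changed by this patch follow from the units involved. A small sketch, assuming the DPM table stores levels in MHz (which the `%uMhz` print after dividing by 1000 suggests) and that `min_clocks.dcefClock` is in 10 kHz units (which the added `* 10` suggests). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* MHz -> kHz: 1 MHz = 1000 kHz. */
static uint32_t mhz_to_khz(uint32_t mhz)
{
	assert(mhz <= UINT32_MAX / 1000u); /* guard against overflow */
	return mhz * 1000u;
}

/* 10 kHz units -> kHz: one unit is 10 kHz. */
static uint32_t ten_khz_units_to_khz(uint32_t units)
{
	assert(units <= UINT32_MAX / 10u); /* guard against overflow */
	return units * 10u;
}

/* kHz -> MHz for display, truncating like integer division in C. */
static uint32_t khz_to_mhz(uint32_t khz)
{
	return khz / 1000u;
}
```

Under these assumptions a 1000 MHz level is reported as 1000000 kHz, and dividing by 1000 again recovers the MHz value used in the printout.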

RE: [PATCH 1/3] drm/amd/powerplay: set a default fclk/gfxclk ratio

2018-11-06 Thread Xu, Feifei

Series is Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Wednesday, November 7, 2018 9:38 AM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH 1/3] drm/amd/powerplay: set a default fclk/gfxclk ratio

Otherwise a big gap between these two clocks may cause hangs.

Change-Id: Ifa3fafe2ee619d6231d5ecab61d3c68faa34abb6
Signed-off-by: Evan Quan 
---
 .../gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c   | 16 
 .../gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h   |  1 +
 drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h |  3 ++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index d2da9e3fc827..4f0f444fd111 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -120,6 +120,7 @@ static void vega20_set_default_registry_data(struct 
pp_hwmgr *hwmgr)
data->registry_data.disable_auto_wattman = 1;
data->registry_data.auto_wattman_debug = 0;
data->registry_data.auto_wattman_sample_period = 100;
+   data->registry_data.fclk_gfxclk_ratio = 0x3F6D;
data->registry_data.auto_wattman_threshold = 50;
data->registry_data.gfxoff_controlled_by_driver = 1;
data->gfxoff_allowed = false;
@@ -829,6 +830,16 @@ static int vega20_enable_all_smu_features(struct pp_hwmgr 
*hwmgr)
return 0;
 }
 
+static int vega20_send_clock_ratio(struct pp_hwmgr *hwmgr)
+{
+   struct vega20_hwmgr *data =
+   (struct vega20_hwmgr *)(hwmgr->backend);
+
+   return smum_send_msg_to_smc_with_parameter(hwmgr,
+   PPSMC_MSG_SetFclkGfxClkRatio,
+   data->registry_data.fclk_gfxclk_ratio);
+}
+
 static int vega20_disable_all_smu_features(struct pp_hwmgr *hwmgr)
 {
struct vega20_hwmgr *data =
@@ -1535,6 +1546,11 @@ static int vega20_enable_dpm_tasks(struct pp_hwmgr 
*hwmgr)
"[EnableDPMTasks] Failed to enable all smu features!",
return result);
 
+   result = vega20_send_clock_ratio(hwmgr);
+   PP_ASSERT_WITH_CODE(!result,
+   "[EnableDPMTasks] Failed to send clock ratio!",
+   return result);
+
/* Initialize UVD/VCE powergating state */
vega20_init_powergate_state(hwmgr);
 
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h
index 56fe6a0d42e8..25faaa5c5b10 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h
@@ -328,6 +328,7 @@ struct vega20_registry_data {
uint8_t   disable_auto_wattman;
uint32_t  auto_wattman_debug;
uint32_t  auto_wattman_sample_period;
+   uint32_t  fclk_gfxclk_ratio;
uint8_t   auto_wattman_threshold;
uint8_t   log_avfs_param;
uint8_t   enable_enginess;
diff --git a/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h 
b/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h
index 45d64a81e945..4f63a736ea0e 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h
@@ -105,7 +105,8 @@
 #define PPSMC_MSG_SetSystemVirtualDramAddrHigh   0x4B
 #define PPSMC_MSG_SetSystemVirtualDramAddrLow0x4C
 #define PPSMC_MSG_WaflTest   0x4D
-// Unused ID 0x4E to 0x50
+#define PPSMC_MSG_SetFclkGfxClkRatio 0x4E
+// Unused ID 0x4F to 0x50
 #define PPSMC_MSG_AllowGfxOff0x51
 #define PPSMC_MSG_DisallowGfxOff 0x52
 #define PPSMC_MSG_GetPptLimit0x53
--
2.19.1


RE: [PATCH] drm/amd/powerplay: disable Vega20 DS related features

2018-11-19 Thread Xu, Feifei

Acked-by: Feifei Xu

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Monday, November 19, 2018 12:20 PM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH] drm/amd/powerplay: disable Vega20 DS related features

Disable these features on Vega20 for now.

Change-Id: I9e826fca3ed3d8001d1b90787d733ca00edd0f54
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index f2daf00cc911..9a773d8e880d 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -75,7 +75,17 @@ static void vega20_set_default_registry_data(struct pp_hwmgr 
*hwmgr)
data->phy_clk_quad_eqn_b = PPREGKEY_VEGA20QUADRATICEQUATION_DFLT;
data->phy_clk_quad_eqn_c = PPREGKEY_VEGA20QUADRATICEQUATION_DFLT;
 
-   data->registry_data.disallowed_features = 0x0;
+   /*
+* Disable the following features for now:
+*   GFXCLK DS
+*   SOCLK DS
+*   LCLK DS
+*   DCEFCLK DS
+*   FCLK DS
+*   MP1CLK DS
+*   MP0CLK DS
+*/
+   data->registry_data.disallowed_features = 0xE0041C00;
data->registry_data.od_state_in_dc_support = 0;
data->registry_data.thermal_support = 1;
data->registry_data.skip_baco_hardware = 0;
-- 
2.19.1
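
The mask `0xE0041C00` can be checked against the comment in the patch by decoding its set bits. The sketch below only decodes bit positions. Which bit corresponds to which feature is defined in the SMU driver interface header, which is not shown here, so no feature names are attached. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask taken from the patch. */
#define DISALLOWED_FEATURES 0xE0041C00u

/* Count the set bits in a 32-bit mask. */
static int count_set_bits(uint32_t mask)
{
	int n = 0;

	/* mask &= mask - 1 clears the lowest set bit on each pass */
	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Return nonzero if the given bit position (0..31) is set in the mask. */
static int bit_is_set(uint32_t mask, unsigned int bit)
{
	assert(bit < 32);
	return (int)((mask >> bit) & 1u);
}
```

The mask has seven bits set (10, 11, 12, 18, 29, 30 and 31), which matches the seven DS features listed in the comment.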


RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

2018-12-09 Thread Xu, Feifei

Tested on pro stack vega20.

-Original Message-
From: amd-gfx  On Behalf Of Feifei Xu
Sent: Monday, December 10, 2018 12:46 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei 
Subject: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

On vega20, the job of executing the ASIC_INIT table when posting card is moved 
to psp. Skip the atombios's ASIC_INIT on vega20 when posting card.

Change-Id: Id1d3c0a0d19296d5ed804de7edf5b09b8d38c0a5
Signed-off-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f2bda76c8e05..310d4eb0536b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2513,8 +2513,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* detect if we are with an SRIOV vbios */
amdgpu_device_detect_sriov_bios(adev);
 
+DRM_INFO("skip posting card using ASIC INIT table in vbios on 
+ vega20\n");
/* Post card if necessary */
-   if (amdgpu_device_need_post(adev)) {
+   if ((adev->asic_type != CHIP_VEGA20) && amdgpu_device_need_post(adev)) 
+{
if (!adev->bios) {
dev_err(adev->dev, "no vBIOS found\n");
r = -EINVAL;
--
2.17.1


RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

2018-12-09 Thread Xu, Feifei

I will remove the DRM_INFO and send v2. Thanks.

-Original Message-
From: Quan, Evan  
Sent: Monday, December 10, 2018 1:10 PM
To: Xu, Feifei ; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei 
Subject: RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

Is the DRM_INFO print necessary?
And it will get printed even running on other ASIC.

Regards,
Evan
> -Original Message-
> From: amd-gfx  On Behalf Of 
> Feifei Xu
> Sent: December 10, 2018 12:46
> To: amd-gfx@lists.freedesktop.org
> Cc: Xu, Feifei 
> Subject: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20
> 
> On vega20, the job of executing the ASIC_INIT table when posting card 
> is moved to psp. Skip the atombios's ASIC_INIT on vega20 when posting card.
> 
> Change-Id: Id1d3c0a0d19296d5ed804de7edf5b09b8d38c0a5
> Signed-off-by: Feifei Xu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f2bda76c8e05..310d4eb0536b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2513,8 +2513,9 @@ int amdgpu_device_init(struct amdgpu_device 
> *adev,
>   /* detect if we are with an SRIOV vbios */
>   amdgpu_device_detect_sriov_bios(adev);
> 
> +DRM_INFO("skip posting card using ASIC INIT table in vbios on 
> + vega20\n");
>   /* Post card if necessary */
> - if (amdgpu_device_need_post(adev)) {
> + if ((adev->asic_type != CHIP_VEGA20) &&
> amdgpu_device_need_post(adev))
> +{
>   if (!adev->bios) {
>   dev_err(adev->dev, "no vBIOS found\n");
>   r = -EINVAL;
> --
> 2.17.1
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

2018-12-10 Thread Xu, Feifei

In fact, in multi-GPU cases, the original logic in 
amdgpu_device_need_post()->amdgpu_atombios_scratch_need_asic_init() returns TRUE.
That logic reads the ATOM_S7_ASIC_INIT_COMPLETE_MASK bit, which the BIOS uses to 
tell the driver whether a post is needed.

Now that the ASIC_INIT table has moved to psp, we still need to skip the post in 
the driver even though a post is reported as needed (the 
ATOM_S7_ASIC_INIT_COMPLETE_MASK bit is unset).

Regards,
Feifei
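
The two options discussed in this thread (checking the ASIC type at the call site, or inside amdgpu_device_need_post() as a negative condition) can be compared with a small model. This is a sketch only: `struct device`, `INIT_COMPLETE_MASK` and the helper names are hypothetical stand-ins, and the meaning of the scratch bit (set means ASIC_INIT is done) is assumed from the explanation above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device model; not the amdgpu structures. */
enum asic_type { ASIC_OTHER, ASIC_VEGA20 };

#define INIT_COMPLETE_MASK 0x1u /* stand-in for ATOM_S7_ASIC_INIT_COMPLETE_MASK */

struct device {
	enum asic_type asic_type;
	uint32_t bios_scratch7; /* scratch register written by the BIOS */
};

/* Assumed meaning: bit set = ASIC_INIT already done, bit clear = post needed. */
static bool scratch_need_asic_init(const struct device *dev)
{
	return (dev->bios_scratch7 & INIT_COMPLETE_MASK) == 0;
}

/* Negative condition placed inside the helper rather than at the call site. */
static bool need_post(const struct device *dev)
{
	assert(dev != NULL);

	/* On Vega20 the PSP runs ASIC_INIT, so the driver never posts. */
	if (dev->asic_type == ASIC_VEGA20)
		return false;

	return scratch_need_asic_init(dev);
}
```

Putting the check inside the helper means every caller gets the same answer, including any resume path that asks whether a post is needed.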

-Original Message-
From: Qu, Jim  
Sent: December 10, 2018 17:42
To: Quan, Evan ; Xu, Feifei ; 
amd-gfx@lists.freedesktop.org
Subject: Reply: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

 I think it is better to move it into amdgpu_device_need_post() as a negative 
condition.

Thanks
JimQu


From: amd-gfx  on behalf of Quan, Evan 

Sent: December 10, 2018 14:33:58
To: Xu, Feifei; amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei
Subject: RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

Reviewed-by: Evan Quan 

> -Original Message-
> From: amd-gfx  On Behalf Of 
> Feifei Xu
> Sent: December 10, 2018 14:17
> To: amd-gfx@lists.freedesktop.org
> Cc: Xu, Feifei ; Quan, Evan 
> Subject: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20
>
> On vega20, the job of executing the ASIC_INIT table when posting card 
> is moved to psp. Skip the atombios's ASIC_INIT on vega20 when posting card.
>
> Change-Id: Id1d3c0a0d19296d5ed804de7edf5b09b8d38c0a5
> Signed-off-by: Feifei Xu 
> Tested-by: Candice Li 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index ef36cc595985..a375d2ac112f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2547,7 +2547,7 @@ int amdgpu_device_init(struct amdgpu_device 
> *adev,
>   amdgpu_device_detect_sriov_bios(adev);
>
>   /* Post card if necessary */
> - if (amdgpu_device_need_post(adev)) {
> + if ((adev->asic_type != CHIP_VEGA20) &&
> amdgpu_device_need_post(adev))
> +{
>   if (!adev->bios) {
>   dev_err(adev->dev, "no vBIOS found\n");
>   r = -EINVAL;
> --
> 2.17.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: Reply: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20

2018-12-10 Thread Xu, Feifei

Agreed. If we also need to skip ASIC_INIT in the driver for S3/S4, we should make 
a change similar to the one you mentioned. I am not quite sure about the S3/S4 
situation for now, though from testing it might be the same case.
I will get feedback from the vbios side and make the corresponding change. Thanks.

Sent from my iPhone

> On December 10, 2018, at 6:28 PM, Qu, Jim  wrote:
> 
> Hi Feifei,
> 
> At what point during boot-up does PSP perform ASIC_INIT?
> 
> In S4, the ASIC will be reset, and it needs to be posted at the beginning of 
> resume. So it is better to move the logic into amdgpu_device_need_post() if 
> ASIC_INIT is performed automatically by PSP during S4.
> 
> Thanks
> JimQu
> 
> ________
> From: Xu, Feifei
> Sent: December 10, 2018 18:09
> To: Qu, Jim; Quan, Evan; amd-gfx@lists.freedesktop.org
> Subject: RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20
> 
> In fact, in multi-GPU cases, the original logic in 
> amdgpu_device_need_post()->amdgpu_atombios_scratch_need_asic_init() returns 
> TRUE.
> That logic reads the ATOM_S7_ASIC_INIT_COMPLETE_MASK bit, which the BIOS uses 
> to tell the driver whether a post is needed.
> 
> Now that the ASIC_INIT table has moved to psp, we still need to skip the post 
> in the driver even though a post is reported as needed (the 
> ATOM_S7_ASIC_INIT_COMPLETE_MASK bit is unset).
> 
> Regards,
> Feifei
> 
> -Original Message-
> From: Qu, Jim 
> Sent: December 10, 2018 17:42
> To: Quan, Evan ; Xu, Feifei ; 
> amd-gfx@lists.freedesktop.org
> Subject: Reply: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20
> 
> I think it is better to move it into amdgpu_device_need_post() as a negative 
> condition.
> 
> Thanks
> JimQu
> 
> ________
> From: amd-gfx  on behalf of Quan, Evan 
> 
> Sent: December 10, 2018 14:33:58
> To: Xu, Feifei; amd-gfx@lists.freedesktop.org
> Cc: Xu, Feifei
> Subject: RE: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20
> 
> Reviewed-by: Evan Quan 
> 
>> -Original Message-
>> From: amd-gfx  On Behalf Of
>> Feifei Xu
>> Sent: December 10, 2018 14:17
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Xu, Feifei ; Quan, Evan 
>> Subject: [PATCH] drm/amdgpu:skip ASIC_INIT when posting card on vg20
>> 
>> On vega20, the job of executing the ASIC_INIT table when posting card
>> is moved to psp. Skip the atombios's ASIC_INIT on vega20 when posting card.
>> 
>> Change-Id: Id1d3c0a0d19296d5ed804de7edf5b09b8d38c0a5
>> Signed-off-by: Feifei Xu 
>> Tested-by: Candice Li 
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index ef36cc595985..a375d2ac112f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2547,7 +2547,7 @@ int amdgpu_device_init(struct amdgpu_device
>> *adev,
>>  amdgpu_device_detect_sriov_bios(adev);
>> 
>>  /* Post card if necessary */
>> - if (amdgpu_device_need_post(adev)) {
>> + if ((adev->asic_type != CHIP_VEGA20) &&
>> amdgpu_device_need_post(adev))
>> +{
>>  if (!adev->bios) {
>>  dev_err(adev->dev, "no vBIOS found\n");
>>  r = -EINVAL;
>> --
>> 2.17.1
>> 
>> ___
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: drop fclk/gfxclk ratio setting

2018-12-11 Thread Xu, Feifei

Acked-by: Feifei Xu 

-Original Message-
From: Evan Quan  
Sent: Wednesday, December 12, 2018 3:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Feng, Kenneth ; Quan, 
Evan 
Subject: [PATCH] drm/amdgpu: drop fclk/gfxclk ratio setting

Since this is not needed any more on the latest SMC firmware.

Change-Id: Id8a34261ba04381f6141ee18cd56b1c4a72c01bd
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 247bf9dbec5d..2e99ecf4ab76 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -130,7 +130,7 @@ static void vega20_set_default_registry_data(struct 
pp_hwmgr *hwmgr)
data->registry_data.disable_auto_wattman = 1;
data->registry_data.auto_wattman_debug = 0;
data->registry_data.auto_wattman_sample_period = 100;
-   data->registry_data.fclk_gfxclk_ratio = 0x3F6D;
+   data->registry_data.fclk_gfxclk_ratio = 0;
data->registry_data.auto_wattman_threshold = 50;
data->registry_data.gfxoff_controlled_by_driver = 1;
data->gfxoff_allowed = false;
-- 
2.19.2


RE: [PATCH 2/2] drm/amdgpu/psp: Fix can't detect psp INVOKE command failed

2018-12-17 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Xiangliang Yu
Sent: Thursday, December 13, 2018 3:42 PM
To: amd-gfx@lists.freedesktop.org
Cc: Yu, Xiangliang 
Subject: [PATCH 2/2] drm/amdgpu/psp: Fix can't detect psp INVOKE command failed

There is no ucode when executing an INVOKE command, so the current code cannot 
detect a failed INVOKE command.

Remove the ucode check.

Signed-off-by: Xiangliang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 2f126ea7..7f5ce37 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -140,10 +140,13 @@ psp_cmd_submit_buf(struct psp_context *psp,
while (*((unsigned int *)psp->fence_buf) != index)
msleep(1);
 
-   /* the status field must be 0 after FW is loaded */
-   if (ucode && psp->cmd_buf_mem->resp.status) {
-   DRM_ERROR("failed loading with status (%d) and ucode id (%d)\n",
- psp->cmd_buf_mem->resp.status, ucode->ucode_id);
+   /* the status field must be 0 after psp command completion */
+   if (psp->cmd_buf_mem->resp.status) {
+   if (ucode)
+   DRM_ERROR("failed to load ucode id (%d) ",
+ ucode->ucode_id);
+   DRM_ERROR("psp command failed and response status is (%d)\n",
+ psp->cmd_buf_mem->resp.status);
return -EINVAL;
}
 
--
2.7.4
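
The behavioural change in this patch can be sketched with simplified types. This is an illustration under assumptions, not the PSP code: `struct ucode`, `struct response` and both function names are hypothetical, and `-1` stands in for the `-EINVAL` returned by the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the amdgpu PSP structures. */
struct ucode { int ucode_id; };
struct response { unsigned int status; };

/* Before the patch: status is only checked when a ucode is attached. */
static int check_status_old(const struct response *resp, const struct ucode *ucode)
{
	assert(resp != NULL);
	if (ucode && resp->status)
		return -1;
	return 0;
}

/* After the patch: any nonzero status fails, with or without a ucode. */
static int check_status_new(const struct response *resp, const struct ucode *ucode)
{
	assert(resp != NULL);
	(void)ucode; /* in the patch the ucode only adds an extra log line */
	if (resp->status)
		return -1;
	return 0;
}
```

A command submitted without a ucode (the INVOKE case described in the commit message) passes the old check even with a nonzero status, and fails the new one.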


RE: [PATCH 1/2] drm/amdgpu/psp: Fix to get wrong xgmi session id

2018-12-17 Thread Xu, Feifei

Hi Xiangliang,

Could you add more comments on the session_id to the commit message? For example, 
that the session_id is used to distinguish each VF vs. PF, etc.

With that added, Reviewed-by: Feifei Xu

Thanks
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Xiangliang Yu
Sent: Thursday, December 13, 2018 3:41 PM
To: amd-gfx@lists.freedesktop.org
Cc: Yu, Xiangliang 
Subject: [PATCH 1/2] drm/amdgpu/psp: Fix to get wrong xgmi session id

The xGMI session id should be read from the response buffer; correct it.

Signed-off-by: Xiangliang Yu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 8fab0d6..2f126ea7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -147,6 +147,9 @@ psp_cmd_submit_buf(struct psp_context *psp,
return -EINVAL;
}
 
+   /* get xGMI session id from response buffer */
+   cmd->resp.session_id = psp->cmd_buf_mem->resp.session_id;
+
if (ucode) {
ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 2/2] drm/amdgpu/nbio7.4: add hw bug workaround for vega20

2018-12-19 Thread Xu, Feifei

Series Acked-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Thursday, December 20, 2018 7:09 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH 2/2] drm/amdgpu/nbio7.4: add hw bug workaround for vega20

Configure PCIE_CI_CNTL to work around a hw bug that affects some multi-GPU 
compute workloads.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
index f8cee95d61cc..4cd31a276dcd 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
@@ -31,6 +31,7 @@
 
 #define smnCPM_CONTROL 0x11180460
 #define smnPCIE_CNTL2 0x11180070
+#define smnPCIE_CI_CNTL 0x11180080
 
 static u32 nbio_v7_4_get_rev_id(struct amdgpu_device *adev)
 {
@@ -222,7 +223,13 @@ static void nbio_v7_4_detect_hw_virt(struct amdgpu_device *adev)
 
 static void nbio_v7_4_init_registers(struct amdgpu_device *adev)
 {
+   uint32_t def, data;
+
+   def = data = RREG32_PCIE(smnPCIE_CI_CNTL);
+   data = REG_SET_FIELD(data, PCIE_CI_CNTL, CI_SLV_ORDERING_DIS, 1);
 
+   if (def != data)
+   WREG32_PCIE(smnPCIE_CI_CNTL, data);
 }
 
 const struct amdgpu_nbio_funcs nbio_v7_4_funcs = {
--
2.13.6
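
The register update in this patch follows a read-modify-write pattern that skips the write when nothing changes. A minimal sketch with a mock register: `FIELD_SHIFT`, `FIELD_MASK`, `reg_read`, `reg_write` and `set_field` are hypothetical stand-ins, since the real mask and shift of CI_SLV_ORDERING_DIS are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout; the real mask and shift are not in the patch. */
#define FIELD_SHIFT 8u
#define FIELD_MASK  (1u << FIELD_SHIFT)

/* Mock register and a write counter, standing in for the PCIE accessors. */
static uint32_t mock_reg;
static unsigned int mock_writes;

static uint32_t reg_read(void)
{
	return mock_reg;
}

static void reg_write(uint32_t v)
{
	mock_reg = v;
	mock_writes++;
}

/* Replace one field in a register value, leaving the other bits untouched. */
static uint32_t set_field(uint32_t val, uint32_t mask, uint32_t shift, uint32_t field)
{
	assert(((field << shift) & ~mask) == 0); /* field must fit in the mask */
	return (val & ~mask) | ((field << shift) & mask);
}

/* Read-modify-write that skips the write when the value is already correct. */
static void enable_workaround(void)
{
	uint32_t def, data;

	def = data = reg_read();
	data = set_field(data, FIELD_MASK, FIELD_SHIFT, 1);
	if (def != data)
		reg_write(data);
}
```

Calling `enable_workaround()` a second time performs no write, because the field is already set.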


RE: [PATCH drm/amdgpu/psp: ignore psp reponse status] drm/amdgpu/psp: ignore psp reponse status

2019-01-14 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Huang, Ray
Sent: Monday, January 14, 2019 4:28 PM
To: Liu, Aaron ; amd-gfx@lists.freedesktop.org; Yu, 
Xiangliang ; Deucher, Alexander 
; Zhang, Hawking 
Cc: Liu, Aaron ; Koenig, Christian 
Subject: RE: [PATCH drm/amdgpu/psp: ignore psp reponse status] drm/amdgpu/psp: 
ignore psp reponse status

In some cases, the response status is not 0 even though there was no problem 
with the submitted command.
Some versions of the PSP FW don't write 0 to that field. So here we would like to 
only print a warning instead of an error during psp initialization to avoid 
breaking hw_init.

Reviewed-by: Huang Rui   


> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
> Of Aaron Liu
> Sent: Monday, January 14, 2019 4:17 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Aaron 
> Subject: [PATCH drm/amdgpu/psp: ignore psp reponse status]
> drm/amdgpu/psp: ignore psp reponse status
> 
> Don't return err if psp reponse status isn't zero
> 
> Change-Id: I680679983f972b6969f4949f1faafaf17fe996a6
> Signed-off-by: Aaron Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 53c2d60..48778b3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -140,14 +140,15 @@ psp_cmd_submit_buf(struct psp_context *psp,
>   while (*((unsigned int *)psp->fence_buf) != index)
>   msleep(1);
> 
> - /* the status field must be 0 after psp command completion */
> + /* the status field should be 0 after psp command completion
> +  * if not, print WARN msg
> +  */
>   if (psp->cmd_buf_mem->resp.status) {
>   if (ucode)
> - DRM_ERROR("failed to load ucode id (%d) ",
> + DRM_WARN("failed to load ucode id (%d) ",
> ucode->ucode_id);
> - DRM_ERROR("psp command failed and response status is (%d)\n",
> + DRM_WARN("psp command failed and response status is (%d)\n",
> psp->cmd_buf_mem->resp.status);
> - return -EINVAL;
>   }
> 
>   /* get xGMI session id from response buffer */
> --
> 2.7.4
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 2/2] drm/amd/powerplay: override duty cycle on Vega20

2019-02-28 Thread Xu, Feifei

Acked-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Thursday, February 28, 2019 6:32 PM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH 2/2] drm/amd/powerplay: override duty cycle on Vega20

This is needed for the new SMC firmwares only.

Change-Id: I5934e5161ec53c1dd73cb1542ef6b738ad2e620c
Signed-off-by: Evan Quan 
---
 .../gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c   | 16 
 drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h |  3 ++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 9aa7bec1b5fe..d35f60ab3404 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -828,6 +828,17 @@ static int vega20_override_pcie_parameters(struct pp_hwmgr *hwmgr)
return 0;
 }
 
+static int vega20_override_duty_cycle(struct pp_hwmgr *hwmgr)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+   int ret = 0;
+
+   if (adev->pm.fw_version >= 0x00282700)
+   ret = smum_send_msg_to_smc(hwmgr, PPSMC_MSG_OverrideDutyCycle);
+
+   return ret;
+}
+
 static int vega20_set_allowed_featuresmask(struct pp_hwmgr *hwmgr)
 {
struct vega20_hwmgr *data =
@@ -1644,6 +1655,11 @@ static int vega20_enable_dpm_tasks(struct pp_hwmgr *hwmgr)
"[EnableDPMTasks] Failed to enable all smu features!",
return result);
 
+   result = vega20_override_duty_cycle(hwmgr);
+   PP_ASSERT_WITH_CODE(!result,
+   "[EnableDPMTasks] Failed to override duty cycle!",
+   return result);
+
result = vega20_override_pcie_parameters(hwmgr);
PP_ASSERT_WITH_CODE(!result,
"[EnableDPMTasks] Failed to override pcie parameters!", 
diff --git a/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h b/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h
index 4f63a736ea0e..4a1e01f04cf5 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/vega20_ppsmc.h
@@ -119,7 +119,8 @@
 #define PPSMC_MSG_PrepareMp1ForShutdown  0x5A
 #define PPSMC_MSG_SetMGpuFanBoostLimitRpm 0x5D
 #define PPSMC_MSG_GetAVFSVoltageByDpm 0x5F
-#define PPSMC_Message_Count  0x60
+#define PPSMC_MSG_OverrideDutyCycle  0x64
+#define PPSMC_Message_Count  0x65
 
 typedef uint32_t PPSMC_Result;
 typedef uint32_t PPSMC_Msg;
--
2.21.0


RE: [PATCH 1/2] drm/amd/powerplay: correct power reading on fiji

2019-02-28 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Evan Quan
Sent: Thursday, February 28, 2019 6:32 PM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan 
Subject: [PATCH 1/2] drm/amd/powerplay: correct power reading on fiji

Set sampling period as 500ms to provide a smooth power reading output. Also, 
correct the register for power reading.

Change-Id: I13935f3e7fcd026d34aa6a68cf7f683dc6785ab7
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 48187acac59e..83d3d935f3ac 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3491,14 +3491,14 @@ static int smu7_get_gpu_power(struct pp_hwmgr *hwmgr, u32 *query)
 
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogStart);
cgs_write_ind_register(hwmgr->device, CGS_IND_REG__SMC,
-   ixSMU_PM_STATUS_94, 0);
+   ixSMU_PM_STATUS_95, 0);
 
for (i = 0; i < 10; i++) {
-   mdelay(1);
+   mdelay(500);
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PmStatusLogSample);
tmp = cgs_read_ind_register(hwmgr->device,
CGS_IND_REG__SMC,
-   ixSMU_PM_STATUS_94);
+   ixSMU_PM_STATUS_95);
if (tmp != 0)
break;
}
--
2.21.0
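
For reference, the loop in the patch is a bounded poll: wait, sample, and stop once a non-zero reading arrives. With 10 attempts at 500 ms each, the worst case is 10 x 500 ms = 5 s, and mdelay() busy-waits for that time. Below is a minimal user-space sketch of the same control flow. The delay and sampler are simulated, and all demo_* names are hypothetical rather than part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_TRIES 10
#define DEMO_DELAY_MS  500

static unsigned demo_waited_ms;    /* total simulated delay */
static unsigned demo_samples;      /* number of samples taken so far */
static unsigned demo_ready_after;  /* sample count at which data appears */

/* Simulated delay: only accumulates the time that would be spent. */
static void demo_delay(unsigned ms)
{
	demo_waited_ms += ms;
}

/* Hypothetical sampler: returns 0 until enough samples were requested. */
static uint32_t demo_sample(void)
{
	demo_samples++;
	return demo_samples >= demo_ready_after ? 0x1234u : 0;
}

/* Bounded poll: delay, sample, and stop on the first non-zero reading. */
static uint32_t demo_poll_power(void)
{
	uint32_t tmp = 0;
	int i;

	for (i = 0; i < DEMO_MAX_TRIES; i++) {
		demo_delay(DEMO_DELAY_MS);
		tmp = demo_sample();
		if (tmp != 0)
			break;
	}
	return tmp;
}
```

If the reading becomes available on the third sample, the sketch accumulates 1500 ms of delay. If it never becomes available, the sketch returns 0 after the full 5000 ms.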


RE: [PATCH] drm/amdgpu: return err if input is not valid

2019-03-04 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: Pan, Xinhui  
Sent: Tuesday, March 5, 2019 3:11 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Xu, Feifei ; 
Deucher, Alexander 
Subject: [PATCH] drm/amdgpu: return err if input is not valid

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 8b1088dac686..1df6b03a3680 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -240,6 +240,9 @@ static int amdgpu_ras_debugfs_ctrl_parse_data(struct file *f,
op = 1;
else if (sscanf(str, "inject %32s %8s", block_name, err) == 2)
op = 2;
+   else if (sscanf(str, "%32s", block_name) == 1)
+   /* ascii string, but commands are not matched. */
+   return -EINVAL;
 
if (op != -1) {
if (amdgpu_ras_find_block_id_by_name(block_name, &block_id))
@@ -352,6 +355,9 @@ static ssize_t amdgpu_ras_debugfs_ctrl_write(struct file *f, const char __user *
case 2:
ret = amdgpu_ras_error_inject(adev, &data.inject);
break;
+   default:
+   ret = -EINVAL;
+   break;
};
 
if (ret)
--
2.17.1


RE: [PATCH] drm/amdgpu: Fix ras debugfs data parse

2019-03-11 Thread Xu, Feifei

Reviewed-by: Feifei Xu 


From: Pan, Xinhui 
Sent: Monday, March 11, 2019 6:26 PM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Deucher, Alexander 

Subject: [PATCH] drm/amdgpu: Fix ras debugfs data parse

Any non-zero char is accepted by sscanf, so when the data is a structure
rather than an ascii command, the parser unexpectedly returns -EINVAL.
Check the first four bytes instead.

Signed-off-by: xinhui pan 
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 74a65a61fd23..a5336d16aa4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -241,7 +241,7 @@ static int amdgpu_ras_debugfs_ctrl_parse_data(struct file *f,
op = 1;
 else if (sscanf(str, "inject %32s %8s", block_name, err) == 2)
op = 2;
- else if (sscanf(str, "%32s", block_name) == 1)
+else if (str[0] && str[1] && str[2] && str[3])
/* ascii string, but commands are not matched. */
return -EINVAL;

--
2.17.1
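
To illustrate the difference between the two checks, here is a minimal user-space sketch. It assumes, for illustration only, that the binary structure begins with a small integer field, so its first byte is non-zero and the next three are zero. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Old check: %32s matches any leading run of non-whitespace bytes,
 * so a binary buffer whose first byte is non-zero also matches. */
static int demo_old_is_ascii_cmd(const char *str)
{
	char block_name[33];

	return sscanf(str, "%32s", block_name) == 1;
}

/* New check from the patch: the first four bytes must all be non-zero. */
static int demo_new_is_ascii_cmd(const char *str)
{
	return str[0] && str[1] && str[2] && str[3];
}
```

For a buffer such as { 0x02, 0, 0, 0, ... }, the old check reports a match, so the input would be rejected as an unknown ascii command. The new check does not match, so the buffer can still be handled as a structure. An unmatched text command such as "bogus cmd" satisfies both checks and is still rejected.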