Re: [PATCH v2] docs/gpu: ci: update flake tests requirements

2024-09-26 Thread Vignesh Raman

Hi Maxime,

On 26/09/24 12:56, Maxime Ripard wrote:

On Thu, Sep 26, 2024 at 12:36:49PM GMT, Vignesh Raman wrote:

Update the documentation to require linking to a relevant GitLab
issue for each new flake entry instead of an email report. Added
specific GitLab issue URLs for i915, xe and other drivers.

Signed-off-by: Vignesh Raman 
---

v2:
- Add gitlab issue link for msm driver.

---
  Documentation/gpu/automated_testing.rst | 16 +++++++++++-----
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst
index 2d5a28866afe..f918fe56f2b0 100644
--- a/Documentation/gpu/automated_testing.rst
+++ b/Documentation/gpu/automated_testing.rst
@@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific hardware revision are
 known to behave unreliably. These tests won't cause a job to fail regardless of
 the result. They will still be run.
  
-Each new flake entry must be associated with a link to the email reporting the
-bug to the author of the affected driver, the board name or Device Tree name of
-the board, the first kernel version affected, the IGT version used for tests,
-and an approximation of the failure rate.
+Each new flake entry must include a link to the relevant GitLab issue, the board
+name or Device Tree name, the first kernel version affected, the IGT version used
+for tests and an approximation of the failure rate.
  
  They should be provided under the following format::
  
-  # Bug Report: $LORE_OR_PATCHWORK_URL
+  # Bug Report: $GITLAB_ISSUE
# Board Name: broken-board.dtb
# Linux Version: 6.6-rc1
# IGT Version: 1.28-gd2af13d9f
# Failure Rate: 100
flaky-test
  
+The GitLab issue must include the logs and the pipeline link. Use the appropriate
+link below to create an issue.
+https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver
+https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver
+https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver
+https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers
+


I can't comment for the others, but drm-misc at least still requires
reporting issues by mail, so, no, sorry, we can't switch to gitlab only
for now.


In https://gitlab.freedesktop.org/drm/ we have xe, i915, msm, nouveau, 
and amd (only for issues). In drm-ci we test i915, msm and amd.
So I will add GitLab links for these and for other drivers use email 
reporting. I will reword this section. Thanks.


Regards,
Vignesh


Re: [PATCH v1 4/9] drm/amd/pm: don't update runpm last_usage on debugfs getter

2024-09-26 Thread Lazar, Lijo



On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:
> Reading pm values from the GPU shouldn't prevent it from being suspended
> by resetting the last active timestamp (eg: if a background app
> monitors GPU sensors every second, it would prevent the autosuspend
> sequence from triggering).
> 
> Tested-by: Mario Limonciello 
> Signed-off-by: Pierre-Eric Pelloux-Prayer 
> ---
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 26 --------------------------
>  1 file changed, 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> index 042a4dd1bd6a..c8f34b1a4d8e 100644
> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> @@ -151,7 +151,6 @@ static ssize_t amdgpu_get_power_dpm_state(struct device 
> *dev,
>  
>   amdgpu_dpm_get_current_power_state(adev, &pm);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return sysfs_emit(buf, "%s\n",
> @@ -275,7 +274,6 @@ static ssize_t amdgpu_get_power_dpm_force_performance_level(struct device *dev,
>  
>   level = amdgpu_dpm_get_performance_level(adev);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return sysfs_emit(buf, "%s\n",
> @@ -373,7 +371,6 @@ static ssize_t amdgpu_get_pp_num_states(struct device *dev,
>   if (amdgpu_dpm_get_pp_num_states(adev, &data))
>   memset(&data, 0, sizeof(data));
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   buf_len = sysfs_emit(buf, "states: %d\n", data.nums);
> @@ -410,7 +407,6 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev,
>  
>   ret = amdgpu_dpm_get_pp_num_states(adev, &data);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   if (ret)
> @@ -536,7 +532,6 @@ static ssize_t amdgpu_get_pp_table(struct device *dev,
>  
>   size = amdgpu_dpm_get_pp_table(adev, &table);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   if (size <= 0)
> @@ -866,7 +861,6 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
>   if (size == 0)
>   size = sysfs_emit(buf, "\n");
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return size;
> @@ -944,7 +938,6 @@ static ssize_t amdgpu_get_pp_features(struct device *dev,
>   if (size <= 0)
>   size = sysfs_emit(buf, "\n");
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return size;
> @@ -1014,7 +1007,6 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
>   if (size == 0)
>   size = sysfs_emit(buf, "\n");
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return size;
> @@ -1259,7 +1251,6 @@ static ssize_t amdgpu_get_pp_sclk_od(struct device *dev,
>  
>   value = amdgpu_dpm_get_sclk_od(adev);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return sysfs_emit(buf, "%d\n", value);
> @@ -1317,7 +1308,6 @@ static ssize_t amdgpu_get_pp_mclk_od(struct device *dev,
>  
>   value = amdgpu_dpm_get_mclk_od(adev);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return sysfs_emit(buf, "%d\n", value);
> @@ -1397,7 +1387,6 @@ static ssize_t amdgpu_get_pp_power_profile_mode(struct device *dev,
>   if (size <= 0)
>   size = sysfs_emit(buf, "\n");
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return size;
> @@ -1485,7 +1474,6 @@ static int amdgpu_hwmon_get_sensor_generic(struct amdgpu_device *adev,
>   /* get the sensor value */
>   r = amdgpu_dpm_read_sensor(adev, sensor, query, &size);
>  
> - pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
>   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
>  
>   return r;
> @@ -1601,7 +1589,6 @@ static ssize_t amdgpu_get_pcie_bw(struct device *dev,
>  
>   amdgpu_asic_get_pcie_usage(adev, &count0, &count1);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return sysfs_emit(buf, "%llu %llu %i\n",
> @@ -1734,7 +1721,6 @@ static ssize_t amdgpu_get_apu_thermal_cap(struct device *dev,
>   else
>   size = sysfs_emit(buf, "failed to get thermal limit\n");
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend(ddev->dev);
>  
>   return size;
> @@ -1807,7 +1793,6 @@ static ssize_t amdgpu_get_pm_metrics(struct device *dev,
>  
>   size = amdgpu_dpm_get_pm_metrics(adev, buf, PAGE_SIZE);
>  
> - pm_runtime_mark_last_busy(ddev->dev);
>   pm_runtime_put_autosuspend

Re: radeon ARUBA NULL pointer dereference

2024-09-26 Thread Ewan Milne
Follow-up, qla2xxx appears to be fixed, most recent test was with:

commit 684a64bf32b6e488004e0ad7f0d7e922798f65b6 (HEAD -> master,
origin/master, origin/HEAD)
Merge: f7fccaa77271 68898131d2df
Author: Linus Torvalds 
Date:   Tue Sep 24 15:44:18 2024 -0700

Merge tag 'nfs-for-6.12-1' of
git://git.linux-nfs.org/projects/anna/linux-nfs

I suppose the most likely fix was probably this one but I do not have
the HW to verify
(report I got was on an AMD EPYC 7262):

commit b348b6d17fd1d5d89b86db602f02bea54a754bd8
Author: Leon Romanovsky 
Date:   Sun Sep 22 21:09:48 2024 +0300

dma-mapping: report unlimited DMA addressing in IOMMU DMA path

-Ewan

On Tue, Sep 24, 2024 at 3:30 PM Ewan Milne  wrote:
>
> I think we are seeing a similar problem w/qla2xxx panicing at boot:
>



RE: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-26 Thread Xu, Feifei
[AMD Official Use Only - AMD Internal Distribution Only]

>>+ refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
>>+   (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);

Is there a corner case where reloading with a different TOS version and a
pending NPS refresh co-exist?

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Tuesday, September 24, 2024 1:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian ; 
Bhardwaj, Rajneesh ; Errabolu, Ramesh 

Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

Add a callback to check if there is any condition detected by GMC block for 
reset on init. One case is if a pending NPS change request is detected. If 
reset is done because of NPS switch, refresh NPS info from discovery table.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 ++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +++++
 drivers/gpu/drm/amd/amdgpu/soc15.c      |  2 ++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 21f1e65c9dc9..011fe3a847d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
struct amdgpu_gmc_memrange *ranges;
int range_cnt, ret, i, j;
uint32_t nps_type;
+   bool refresh;

if (!mem_ranges)
return -EINVAL;

+   refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
+ (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges,
-   &range_cnt, false);
+   &range_cnt, refresh);

if (ret)
return ret;
@@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev)
 			adev->dev,
 			"NPS mode change request done, reload driver to complete the change\n");
 }
+
+bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev)
+{
+   if (adev->gmc.gmc_funcs->need_reset_on_init)
+   return adev->gmc.gmc_funcs->need_reset_on_init(adev);
+
+   return false;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index b13d6adb5efd..d4cd247fe574 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -78,6 +78,8 @@ enum amdgpu_memory_partition {
 BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \
 BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE))

+#define AMDGPU_GMC_INIT_RESET_NPS  BIT(0)
+
 /*
  * GMC page fault information
  */
@@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs {
/* Request NPS mode */
int (*request_mem_partition_mode)(struct amdgpu_device *adev,
  int nps_mode);
+   bool (*need_reset_on_init)(struct amdgpu_device *adev);
 };

 struct amdgpu_xgmi_ras {
@@ -314,6 +317,7 @@ struct amdgpu_gmc {
const struct amdgpu_gmc_funcs   *gmc_funcs;
enum amdgpu_memory_partitionrequested_nps_mode;
uint32_t supported_nps_modes;
+   uint32_t reset_flags;

struct amdgpu_xgmi xgmi;
struct amdgpu_irq_src   ecc_irq;
@@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
 					int nps_mode);
 void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev);
+bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev);

 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 619933f252aa..97ca4931a7ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -833,6 +833,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev)

if (amdgpu_psp_tos_reload_needed(adev))
return true;
+   if (amdgpu_gmc_need_reset_on_init(adev))
+   return true;
/* Just return false for soc15 GPUs.  Reset does not seem to
 * be necessary.
 */
--
2.25.1



Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL

2024-09-26 Thread Christian König

Am 26.09.24 um 11:31 schrieb Paneer Selvam, Arunpravin:

Hi Christian,

On 9/26/2024 2:57 PM, Christian König wrote:

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

[SNIP]
+int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
+    struct drm_file *filp)
+{
+    struct drm_amdgpu_userq_fence_info *fence_info = NULL;
+    struct drm_amdgpu_userq_wait *wait_info = data;
+    u32 *syncobj_handles, *bo_handles;
+    struct dma_fence **fences = NULL;
+    u32 num_syncobj, num_bo_handles;
+    struct drm_gem_object **gobj;
+    struct drm_exec exec;
+    int r, i, entry, cnt;
+    u64 num_fences = 0;
+
+    num_bo_handles = wait_info->num_bo_handles;
+    bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array),
+                             sizeof(u32) * num_bo_handles);
+    if (IS_ERR(bo_handles))
+    return PTR_ERR(bo_handles);
+
+    num_syncobj = wait_info->num_syncobj_handles;
+    syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array),
+                                  sizeof(u32) * num_syncobj);
+    if (IS_ERR(syncobj_handles)) {
+    r = PTR_ERR(syncobj_handles);
+    goto free_bo_handles;
+    }
+
+    /* Array of GEM object handles */
+    gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL);
+    if (!gobj) {
+    r = -ENOMEM;
+    goto free_syncobj_handles;
+    }
+
+    for (entry = 0; entry < num_bo_handles; entry++) {
+    gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]);
+    if (!gobj[entry]) {
+    r = -ENOENT;
+    goto put_gobj;
+    }
+    }
+
+    drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+    drm_exec_until_all_locked(&exec) {
+    r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0);
+    drm_exec_retry_on_contention(&exec);
+    if (r) {
+    drm_exec_fini(&exec);
+    goto put_gobj;
+    }
+    }
+
+    if (!wait_info->num_fences) {
+    /* Count syncobj's fence */
+    for (i = 0; i < num_syncobj; i++) {
+    struct dma_fence *fence;
+
+    r = drm_syncobj_find_fence(filp, syncobj_handles[i],
+   0, 0, &fence);
+    dma_fence_put(fence);
+
+    if (r || !fence)
+    continue;
+
+    num_fences++;
+    }
+
+    /* Count GEM objects fence */
+    for (i = 0; i < num_bo_handles; i++) {
+    struct dma_resv_iter resv_cursor;
+    struct dma_fence *fence;
+
+    dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv,
+ dma_resv_usage_rw(wait_info->bo_wait_flags &
+    AMDGPU_USERQ_BO_WRITE), fence)
+    num_fences++;


We should probably adjust the UAPI here once more.

The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the 
whole IOCTL instead of per BO.


So the best approach would probably be to drop the 
AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers 
and writers.


Can you work on that Arun? Shouldn't be more than a bit typing exercise.

Sure, I will modify and send the next version of this file.


Thanks.

In the meantime I'm going to review the rest of the series, so there 
could be more comments. But please update the UAPI first.


Regards,
Christian.



Thanks,
Arun.


Thanks,
Christian.







Re: [PATCH v2] drm/sched: Further optimise drm_sched_entity_push_job

2024-09-26 Thread Philipp Stanner
On Mon, 2024-09-23 at 15:35 +0100, Tvrtko Ursulin wrote:
> 
> Ping Christian and Philipp - reasonably happy with v2? I think it's
> the 
> only unreviewed patch from the series.

Howdy,

sry for the delay, I had been traveling.

I have a few nits below regarding the commit message. Besides, I'm OK
with that, thx for your work :)

> 
> Regards,
> 
> Tvrtko
> 
> On 16/09/2024 18:30, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin 
> > 
> > Having removed one re-lock cycle on the entity->lock in a patch
> > titled
> > "drm/sched: Optimise drm_sched_entity_push_job", 
> > with only a tiny bit
> > larger refactoring we can do the same optimisation 

Well, the commit message does not state which optimization that is. One
would have to look for the previous patch, which you apparently cannot
provide a commit ID for yet because it's not in Big Boss's branch.

In this case I am for including a sentence about what is being
optimized also because

> > on the rq->lock.
> > (Currently both drm_sched_rq_add_entity() and
> > drm_sched_rq_update_fifo_locked() take and release the same lock.)
> > 
> > To achieve this we make drm_sched_rq_update_fifo_locked() and

it's not clear what the "this" that's being achieved is.

> > drm_sched_rq_add_entity() expect the rq->lock to be held.
> > 
> > We also align drm_sched_rq_update_fifo_locked(),
> > drm_sched_rq_add_entity() and
> > drm_sched_rq_remove_fifo_locked() function signatures, by adding rq
> > as a
> > parameter to the latter.
> > 
> > v2:
> >   * Fix after rebase of the series.
> >   * Avoid naming inconsistency between drm_sched_rq_add/remove.
> > (Christian)
> > 
> > Signed-off-by: Tvrtko Ursulin 

Reviewed-by: Philipp Stanner 

> > Cc: Christian König 
> > Cc: Alex Deucher 
> > Cc: Luben Tuikov 
> > Cc: Matthew Brost 
> > Cc: Philipp Stanner 
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 12 ++++++++----
> >   drivers/gpu/drm/scheduler/sched_main.c   | 29 +++++++++++----------
> >   include/drm/gpu_scheduler.h              |  3 ++-
> >   3 files changed, 26 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index d982cebc6bee..8ace1f1ea66b 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -515,9 +515,14 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> >   
> >     next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> >     if (next) {
> > +   struct drm_sched_rq *rq;
> > +
> >     spin_lock(&entity->lock);
> > -   drm_sched_rq_update_fifo_locked(entity,
> > +   rq = entity->rq;
> > +   spin_lock(&rq->lock);
> > +   drm_sched_rq_update_fifo_locked(entity, rq,
> >     next->submit_ts);
> > +   spin_unlock(&rq->lock);
> >     spin_unlock(&entity->lock);
> >     }
> >     }
> > @@ -618,11 +623,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> >     sched = rq->sched;
> >   
> >     atomic_inc(sched->score);
> > +
> > +   spin_lock(&rq->lock);
> >     drm_sched_rq_add_entity(rq, entity);
> >   
> >     if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> > -   drm_sched_rq_update_fifo_locked(entity, submit_ts);
> > +   drm_sched_rq_update_fifo_locked(entity, rq, submit_ts);
> >   
> > +   spin_unlock(&rq->lock);
> >     spin_unlock(&entity->lock);
> >   
> >     drm_sched_wakeup(sched, entity);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 18a952f73ecb..5c83fb92bb89 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -153,17 +153,18 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a,
> >     return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting);
> >   }
> >   
> > -static inline void drm_sched_rq_remove_fifo_locked(struct
> > drm_sched_entity *entity)
> > +static void drm_sched_rq_remove_fifo_locked(struct

I think the commit message should contain a short sentence about why
you removed the inline.

AKA "As we're at it, remove the inline function specifier from
drm_sched_rq_remove_fifo_locked() because XYZ"


P.

> > drm_sched_entity *entity,
> > +       struct drm_sched_rq
> > *rq)
> >   {
> > -   struct drm_sched_rq *rq = entity->rq;
> > -
> >     if (!RB_EMPTY_NODE(&entity->rb_tree_node)) {
> >     rb_erase_cached(&entity->rb_tree_node, &rq-
> > >rb_tree_root);
> >     RB_CLEAR_NODE(&entity->rb_tree_node);
> >     }
> >   }
> >   
> > -void drm_sched_rq_update_fifo_locked(struct drm_sched

Re: [PATCH v2] drm/sched: Further optimise drm_sched_entity_push_job

2024-09-26 Thread Philipp Stanner
On Tue, 2024-09-24 at 14:02 +0200, Christian König wrote:
> Am 24.09.24 um 11:58 schrieb Tvrtko Ursulin:
> > 
> > On 24/09/2024 10:45, Tvrtko Ursulin wrote:
> > > 
> > > On 24/09/2024 09:20, Christian König wrote:
> > > > Am 16.09.24 um 19:30 schrieb Tvrtko Ursulin:
> > > > > From: Tvrtko Ursulin 
> > > > > 
> > > > > Having removed one re-lock cycle on the entity->lock in a
> > > > > patch titled
> > > > > "drm/sched: Optimise drm_sched_entity_push_job", with only a
> > > > > tiny bit
> > > > > larger refactoring we can do the same optimisation on the rq-
> > > > > >lock.
> > > > > (Currently both drm_sched_rq_add_entity() and
> > > > > drm_sched_rq_update_fifo_locked() take and release the same
> > > > > lock.)
> > > > > 
> > > > > To achieve this we make drm_sched_rq_update_fifo_locked() and
> > > > > drm_sched_rq_add_entity() expect the rq->lock to be held.
> > > > > 
> > > > > We also align drm_sched_rq_update_fifo_locked(),
> > > > > drm_sched_rq_add_entity() and
> > > > > drm_sched_rq_remove_fifo_locked() function signatures, by
> > > > > adding rq 
> > > > > as a
> > > > > parameter to the latter.
> > > > > 
> > > > > v2:
> > > > >   * Fix after rebase of the series.
> > > > >   * Avoid naming inconsistency between
> > > > > drm_sched_rq_add/remove. 
> > > > > (Christian)
> > > > > 
> > > > > Signed-off-by: Tvrtko Ursulin 
> > > > > Cc: Christian König 
> > > > > Cc: Alex Deucher 
> > > > > Cc: Luben Tuikov 
> > > > > Cc: Matthew Brost 
> > > > > Cc: Philipp Stanner 
> > > > 
> > > > Reviewed-by: Christian König 
> > > 
> > > Thanks!
> > > 
> > > Are you okay to pull into drm-misc-next or we should do some more
> > > testing on this?
> > > 
> > > And/or should I resend the series once more in it's entirety so
> > > this 
> > > v2 is not a reply-to to the original?
> > 
> > I have to respin for the drm_sched_wakeup fix that landed.
> 
> When I should push the series to drm-misc-next then please send it to
> me 
> once more.
> 
> On the other hand we should now have two maintainers for that.

Yup, will pick up these responsibilities soonish. Danilo and I have been
at a conference recently and I'm out of office soon for a bit, but you
can expect me / us to take over that work soonish in early autumn.

Regards,
P.

> 
> Regards,
> Christian.
> 
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > 
> > > > > ---
> > > > >   drivers/gpu/drm/scheduler/sched_entity.c | 12 ++++++++----
> > > > >   drivers/gpu/drm/scheduler/sched_main.c   | 29 +++++++++++----------
> > > > >   include/drm/gpu_scheduler.h              |  3 ++-
> > > > >   3 files changed, 26 insertions(+), 18 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > index d982cebc6bee..8ace1f1ea66b 100644
> > > > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > @@ -515,9 +515,14 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> > > > >   next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> > > > >   if (next) {
> > > > > +    struct drm_sched_rq *rq;
> > > > > +
> > > > >   spin_lock(&entity->lock);
> > > > > -    drm_sched_rq_update_fifo_locked(entity,
> > > > > +    rq = entity->rq;
> > > > > +    spin_lock(&rq->lock);
> > > > > +    drm_sched_rq_update_fifo_locked(entity, rq,
> > > > >   next->submit_ts);
> > > > > +    spin_unlock(&rq->lock);
> > > > >   spin_unlock(&entity->lock);
> > > > >   }
> > > > >   }
> > > > > @@ -618,11 +623,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> > > > >   sched = rq->sched;
> > > > >   atomic_inc(sched->score);
> > > > > +
> > > > > +    spin_lock(&rq->lock);
> > > > >   drm_sched_rq_add_entity(rq, entity);
> > > > >   if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> > > > > -    drm_sched_rq_update_fifo_locked(entity,
> > > > > submit_ts);
> > > > > +    drm_sched_rq_update_fifo_locked(entity, rq,
> > > > > submit_ts);
> > > > > +    spin_unlock(&rq->lock);
> > > > >   spin_unlock(&entity->lock);
> > > > >   drm_sched_wakeup(sched, entity);
> > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > index 18a952f73ecb..5c83fb92bb89 100644
> > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > @@ -153,17 +153,18 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a,
> > > > >   return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting);
> > > > >   }
> > > > > -static inline void drm_sched_rq_remove_fifo_

[PATCH] drm/amdgpu: fix PTE copy corruption for sdma 7

2024-09-26 Thread Min, Frank
[AMD Official Use Only - AMD Internal Distribution Only]

From: Frank Min 

Without setting the DCC bit, there is random PTE copy corruption on SDMA 7,
so add this bit and update the packet format accordingly.

Signed-off-by: Frank Min 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
index cfd8e183ad50..a8763496aed3 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
@@ -1080,13 +1080,16 @@ static void sdma_v7_0_vm_copy_pte(struct amdgpu_ib *ib,
unsigned bytes = count * 8;

	ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_COPY) |
-   SDMA_PKT_COPY_LINEAR_HEADER_SUB_OP(SDMA_SUBOP_COPY_LINEAR);
+   SDMA_PKT_COPY_LINEAR_HEADER_SUB_OP(SDMA_SUBOP_COPY_LINEAR) |
+   SDMA_PKT_COPY_LINEAR_HEADER_CPV(1);
+
ib->ptr[ib->length_dw++] = bytes - 1;
ib->ptr[ib->length_dw++] = 0; /* src/dst endian swap */
ib->ptr[ib->length_dw++] = lower_32_bits(src);
ib->ptr[ib->length_dw++] = upper_32_bits(src);
ib->ptr[ib->length_dw++] = lower_32_bits(pe);
ib->ptr[ib->length_dw++] = upper_32_bits(pe);
+   ib->ptr[ib->length_dw++] = 0;

 }

@@ -1744,7 +1747,7 @@ static void sdma_v7_0_set_buffer_funcs(struct amdgpu_device *adev)
 }

 static const struct amdgpu_vm_pte_funcs sdma_v7_0_vm_pte_funcs = {
-   .copy_pte_num_dw = 7,
+   .copy_pte_num_dw = 8,
.copy_pte = sdma_v7_0_vm_copy_pte,
.write_pte = sdma_v7_0_vm_write_pte,
.set_pte_pde = sdma_v7_0_vm_set_pte_pde,
--
2.43.0




Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters

2024-09-26 Thread Lazar, Lijo



On 9/25/2024 8:02 PM, Pierre-Eric Pelloux-Prayer wrote:
> 
> 
> Le 25/09/2024 à 15:35, Lazar, Lijo a écrit :
>>
>>
>> On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:
>>> Don't wake up the GPU for reading pm values. Instead, take a runtime
>>> powermanagement ref when trying to read it iff the GPU is already
>>> awake.
>>>
>>> This avoids spurious wake ups (eg: from applets).
>>>
>>> We use pm_runtime_get_if_active(ign_usage_count=true) because we care
>>> about "is the GPU awake?" not about "is the GPU awake and something else
>>> prevents suspend?".
>>>
>>> Tested-by: Mario Limonciello 
>>> Signed-off-by: Pierre-Eric Pelloux-Prayer
>>> 
>>> ---
>>>   drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++---
>>>   1 file changed, 69 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> index c8f34b1a4d8e..f1f339b75380 100644
>>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>
>> Not all our devices support runtime PM.
>>
>> As per the API documentation -
>>
>> "-EINVAL is returned if runtime PM is disabled for the
>> device, in which case also the usage_count will remain unmodified."
> 
> This doc is about "dev->power.disable_depth > 0", not about runtime PM
> being disabled (like you would get by using pm_runtime_forbid).
> 
> When runtime PM is disabled, usage_count is always >= 1 and the status
> is RPM_ACTIVE so pm_runtime_get_if_active will succeed.
> 
> I tested on 2 dGPUs without runtime PM, and both seemed to work fine.
> 

Thanks for clarifying. Found that default enablement is taken care of by
the PCI base driver.

https://elixir.bootlin.com/linux/v6.11/source/drivers/pci/pci.c#L3173

Thanks,
Lijo

> Pierre-Eric
> 
>>
>> If that's true, this won't be working on devices without runtime PM
>> support.
>>
>> Thanks,
>> Lijo
>>>     amdgpu_dpm_get_current_power_state(adev, &pm);
>>>   @@ -268,9 +268,9 @@ static ssize_t amdgpu_get_power_dpm_force_performance_level(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     level = amdgpu_dpm_get_performance_level(adev);
>>>   @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     if (amdgpu_dpm_get_pp_num_states(adev, &data))
>>>   memset(&data, 0, sizeof(data));
>>> @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     amdgpu_dpm_get_current_power_state(adev, &pm);
>>>   @@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     size = amdgpu_dpm_get_pp_table(adev, &table);
>>>   @@ -840,9 +840,9 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     for (clk_index = 0 ; clk_index < 6 ; clk_index++) {
>>>   ret = amdgpu_dpm_emit_clock_levels(adev,
>>> od_clocks[clk_index], buf, &size);
>>> @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    

RE: [PATCH 5/7] drm/amdgpu: Place NPS mode request on unload

2024-09-26 Thread Xu, Feifei
[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Tuesday, September 24, 2024 1:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian ; 
Bhardwaj, Rajneesh ; Errabolu, Ramesh 

Subject: [PATCH 5/7] drm/amdgpu: Place NPS mode request on unload

If a user has requested an NPS mode switch, place the request through PSP during 
unload of the driver. For devices which are part of a hive, all requests are 
placed together. If one of them fails, revert to the current NPS mode.

Signed-off-by: Lijo Lazar 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  1 +  
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  | 47   
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  2 +  
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 38 +++  
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h |  4 ++
 5 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 95331294509c..d16cdcdb2114 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2428,6 +2428,7 @@ amdgpu_pci_remove(struct pci_dev *pdev)
struct amdgpu_device *adev = drm_to_adev(dev);

amdgpu_xcp_dev_unplug(adev);
+   amdgpu_gmc_prepare_nps_mode_change(adev);
drm_dev_unplug(dev);

	if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 24a1f931d9ed..21f1e65c9dc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1345,3 +1345,50 @@ int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,

	return psp_memory_partition(&adev->psp, nps_mode);
}
+
+static inline bool amdgpu_gmc_need_nps_switch_req(struct amdgpu_device *adev,
+ int req_nps_mode,
+ int cur_nps_mode)
+{
+   return (((BIT(req_nps_mode) & adev->gmc.supported_nps_modes) ==
+   BIT(req_nps_mode)) &&
+   req_nps_mode != cur_nps_mode);
+}
+
+void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev)
+{
+   int req_nps_mode, cur_nps_mode, r;
+   struct amdgpu_hive_info *hive;
+
+   if (amdgpu_sriov_vf(adev) || !adev->gmc.supported_nps_modes ||
+   !adev->gmc.gmc_funcs->request_mem_partition_mode)
+   return;
+
+   cur_nps_mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev);
+   hive = amdgpu_get_xgmi_hive(adev);
+   if (hive) {
+   req_nps_mode = atomic_read(&hive->requested_nps_mode);
+   if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode,
+   cur_nps_mode)) {
+   amdgpu_put_xgmi_hive(hive);
+   return;
+   }
+   r = amdgpu_xgmi_request_nps_change(adev, hive, req_nps_mode);
+   amdgpu_put_xgmi_hive(hive);
+   goto out;
+   }
+
+   req_nps_mode = adev->gmc.requested_nps_mode;
+   if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode, cur_nps_mode))
+   return;
+
+   /* even if this fails, we should let driver unload w/o blocking */
+   r = adev->gmc.gmc_funcs->request_mem_partition_mode(adev,
+							req_nps_mode);
+out:
+   if (r)
+   dev_err(adev->dev, "NPS mode change request failed\n");
+   else
+   dev_info(
+	   adev->dev,
+	   "NPS mode change request done, reload driver to complete the change\n");
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 1a50639a003a..b13d6adb5efd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -467,4 +467,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,

 int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
int nps_mode);
+void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 5d721ccb9dfd..db2c1b11b813 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -1564,3 +1564,41 @@ int amdgpu_xgmi_reset_on_init(struct amdgpu_device *adev)

return 0;
 }
+
+int amdgpu_xgmi_request_nps_change(struct amdgpu_device *adev,
+  struct amdgpu_hive_info *hive,
+  int req_nps_mode)
+{
+   struct amdgpu_device *tmp_adev;
+   int cur_nps_mode, r;
+
+   /* This is expected to be called only during unload of drive

RE: [PATCH 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3

2024-09-26 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Tuesday, September 24, 2024 1:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian ; 
Bhardwaj, Rajneesh ; Errabolu, Ramesh 

Subject: [PATCH 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3

Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC 
v9.4.4 currently support this. NPS switch is only supported if an SoC supports 
multiple NPS modes.

Signed-off-by: Lijo Lazar 
Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 44 
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c   | 12 +++
 3 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
index f61d117b0caf..79c2f807b9fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
@@ -101,6 +101,7 @@ struct amdgpu_nbio_funcs {
int (*get_compute_partition_mode)(struct amdgpu_device *adev);
u32 (*get_memory_partition_mode)(struct amdgpu_device *adev,
 u32 *supp_modes);
+   bool (*is_nps_switch_requested)(struct amdgpu_device *adev);
u64 (*get_pcie_replay_count)(struct amdgpu_device *adev);
	void (*set_reg_remap)(struct amdgpu_device *adev);
};
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index cafcb24449df..6a95402985ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1395,6 +1395,17 @@ gmc_v9_0_query_memory_partition(struct amdgpu_device *adev)
	return gmc_v9_0_get_memory_partition(adev, NULL);
}

+static bool gmc_v9_0_need_reset_on_init(struct amdgpu_device *adev)
+{
+   if (adev->nbio.funcs &&
+   adev->nbio.funcs->is_nps_switch_requested(adev)) {
+   adev->gmc.reset_flags |= AMDGPU_GMC_INIT_RESET_NPS;
+   return true;
+   }
+
+   return false;
+}
+
 static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
	.flush_gpu_tlb_pasid = gmc_v9_0_flush_gpu_tlb_pasid,
@@ -1406,6 +1417,8 @@ static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
.override_vm_pte_flags = gmc_v9_0_override_vm_pte_flags,
.get_vbios_fb_size = gmc_v9_0_get_vbios_fb_size,
.query_mem_partition_mode = &gmc_v9_0_query_memory_partition,
+   .request_mem_partition_mode = &amdgpu_gmc_request_memory_partition,
+   .need_reset_on_init = &gmc_v9_0_need_reset_on_init,
 };

 static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)
@@ -1545,6 +1558,28 @@ static void gmc_v9_0_set_xgmi_ras_funcs(struct amdgpu_device *adev)
adev->gmc.xgmi.ras = &xgmi_ras;
 }

+static void gmc_v9_0_init_nps_details(struct amdgpu_device *adev)
+{
+   adev->gmc.supported_nps_modes = 0;
+
+   if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU))
+   return;
+
+   /*TODO: Check PSP version also which supports NPS switch. Otherwise keep
+* supported modes as 0.
+*/
+   switch (amdgpu_ip_version(adev, GC_HWIP, 0)) {
+   case IP_VERSION(9, 4, 3):
+   case IP_VERSION(9, 4, 4):
+   adev->gmc.supported_nps_modes =
+   BIT(AMDGPU_NPS1_PARTITION_MODE) |
+   BIT(AMDGPU_NPS4_PARTITION_MODE);
+   break;
+   default:
+   break;
+   }
+}
+
 static int gmc_v9_0_early_init(void *handle)
 {
	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -2165,6 +2200,7 @@ static int gmc_v9_0_sw_init(void *handle)
if (r)
return r;

+   gmc_v9_0_init_nps_details(adev);
/*
 * number of VMs
 * VMID 0 is reserved for System
@@ -2440,6 +2476,14 @@ static int gmc_v9_0_resume(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /* If a reset is done for NPS mode switch, read the memory range
+* information again.
+*/
+   if (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS) {
+   gmc_v9_0_init_sw_mem_ranges(adev, adev->gmc.mem_partitions);
+   adev->gmc.reset_flags &= ~AMDGPU_GMC_INIT_RESET_NPS;
+   }
+
r = gmc_v9_0_hw_init(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
index d1bd79bbae53..8a0a63ac88d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
@@ -401,6 +401,17 @@ static int nbio_v7_9_get_compute_partition_mode(struct amdgpu_device *adev)
return px;
 }

+static bool nbio_v7_9_is_nps_switch_requested(struct amdgp

RE: [PATCH 2/7] drm/amdgpu: Add PSP interface for NPS switch

2024-09-26 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Tuesday, September 24, 2024 1:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian ; 
Bhardwaj, Rajneesh ; Errabolu, Ramesh 

Subject: [PATCH 2/7] drm/amdgpu: Add PSP interface for NPS switch

Implement the PSP ring command interface for memory partitioning on the fly on 
the supported ASICs.

Signed-off-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +  
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h |  1 +  
drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 14 +++---
 3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 944dad9ad29f..04be0fabb4f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1045,6 +1045,31 @@ static int psp_rl_load(struct amdgpu_device *adev)
return ret;
 }

+int psp_memory_partition(struct psp_context *psp, int mode)
+{
+   struct psp_gfx_cmd_resp *cmd;
+   int ret;
+
+   if (amdgpu_sriov_vf(psp->adev))
+   return 0;
+
+   cmd = acquire_psp_cmd_buf(psp);
+
+   cmd->cmd_id = GFX_CMD_ID_FB_NPS_MODE;
+   cmd->cmd.cmd_memory_part.mode = mode;
+
+   dev_info(psp->adev->dev,
+"Requesting %d memory partition change through PSP", mode);
+   ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr);
+   if (ret)
+   dev_err(psp->adev->dev,
+   "PSP request failed to change to NPS%d mode\n", mode);
+
+   release_psp_cmd_buf(psp);
+
+   return ret;
+}
+
 int psp_spatial_partition(struct psp_context *psp, int mode)
 {
struct psp_gfx_cmd_resp *cmd;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 76fa18ffc045..567cb1f924ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -553,6 +553,7 @@ int psp_load_fw_list(struct psp_context *psp,
 void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size);

 int psp_spatial_partition(struct psp_context *psp, int mode);
+int psp_memory_partition(struct psp_context *psp, int mode);

 int is_psp_fw_valid(struct psp_bin_desc bin);

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h 
b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
index 604301371e4f..f4a91b126c73 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
+++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
@@ -103,8 +103,10 @@ enum psp_gfx_cmd_id
 GFX_CMD_ID_AUTOLOAD_RLC   = 0x0021,   /* Indicates all graphics fw loaded, start RLC autoload */
 GFX_CMD_ID_BOOT_CFG   = 0x0022,   /* Boot Config */
 GFX_CMD_ID_SRIOV_SPATIAL_PART = 0x0027,   /* Configure spatial partitioning mode */
-   /*IDs of performance monitoring/profiling*/
-   GFX_CMD_ID_CONFIG_SQ_PERFMON = 0x0046,   /* Config CGTT_SQ_CLK_CTRL */
+/*IDs of performance monitoring/profiling*/
+GFX_CMD_ID_CONFIG_SQ_PERFMON  = 0x0046,   /* Config CGTT_SQ_CLK_CTRL */
+/* Dynamic memory partitioning (NPS mode change)*/
+GFX_CMD_ID_FB_NPS_MODE = 0x0048,  /* Configure memory partitioning mode */
 };

 /* PSP boot config sub-commands */
@@ -362,6 +364,11 @@ struct psp_gfx_cmd_config_sq_perfmon {
uint8_t reserved[5];
 };

+struct psp_gfx_cmd_fb_memory_part {
+   uint32_t mode; /* requested NPS mode */
+   uint32_t resvd;
+};
+
 /* All GFX ring buffer commands. */
 union psp_gfx_commands
 {
@@ -376,7 +383,8 @@ union psp_gfx_commands
 struct psp_gfx_cmd_load_toc cmd_load_toc;
 struct psp_gfx_cmd_boot_cfg boot_cfg;
 struct psp_gfx_cmd_sriov_spatial_part cmd_spatial_part;
-   struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon;
+struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon;
+struct psp_gfx_cmd_fb_memory_part cmd_memory_part;
 };

 struct psp_gfx_uresp_reserved
--
2.25.1



RE: [PATCH 3/7] drm/amdgpu: Add gmc interface to request NPS mode

2024-09-26 Thread Xu, Feifei

Reviewed-by: Feifei Xu 

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Tuesday, September 24, 2024 1:57 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander 
; Koenig, Christian ; 
Bhardwaj, Rajneesh ; Errabolu, Ramesh 

Subject: [PATCH 3/7] drm/amdgpu: Add gmc interface to request NPS mode

Add a common interface in GMC to request NPS mode through PSP. Also add a 
variable in the hive and GMC structures to track the last requested mode.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  | 16   
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  6 ++  
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c |  1 +  
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4f088a5368d8..758fda4e628f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1247,3 +1247,19 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,

return ret;
 }
+
+int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
+   int nps_mode)
+{
+   /* Not supported on VF devices and APUs */
+   if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU))
+   return -EOPNOTSUPP;
+
+   if (!adev->psp.funcs) {
+   dev_err(adev->dev,
+		"PSP interface not available for nps mode change request");
+   return -EINVAL;
+   }
+
+	return psp_memory_partition(&adev->psp, nps_mode);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 33b2adffd58b..f5be5112b742 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -161,6 +161,9 @@ struct amdgpu_gmc_funcs {

enum amdgpu_memory_partition (*query_mem_partition_mode)(
struct amdgpu_device *adev);
+   /* Request NPS mode */
+   int (*request_mem_partition_mode)(struct amdgpu_device *adev,
+ int nps_mode);
 };

 struct amdgpu_xgmi_ras {
@@ -304,6 +307,7 @@ struct amdgpu_gmc {
struct amdgpu_mem_partition_info *mem_partitions;
uint8_t num_mem_partitions;
const struct amdgpu_gmc_funcs   *gmc_funcs;
+   enum amdgpu_memory_partitionrequested_nps_mode;

struct amdgpu_xgmi xgmi;
struct amdgpu_irq_src   ecc_irq;
@@ -455,4 +459,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 struct amdgpu_mem_partition_info *mem_ranges,
 int exp_ranges);

+int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
+   int nps_mode);
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index b17e63c98a99..5d721ccb9dfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -667,6 +667,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
task_barrier_init(&hive->tb);
hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
hive->hi_req_gpu = NULL;
+   atomic_set(&hive->requested_nps_mode, UNKNOWN_MEMORY_PARTITION_MODE);

/*
	 * hive pstate on boot is high in vega20 so we have to go to low
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index d652727ca565..67abadb4f298 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -46,6 +46,7 @@ struct amdgpu_hive_info {
atomic_t ras_recovery;
struct ras_event_manager event_mgr;
struct work_struct reset_on_init_work;
+   atomic_t requested_nps_mode;
 };

 struct amdgpu_pcs_ras_field {
--
2.25.1



Re: [PATCH v2] drm/sched: Further optimise drm_sched_entity_push_job

2024-09-26 Thread Christian König

On 26.09.24 at 10:18, Philipp Stanner wrote:

On Tue, 2024-09-24 at 14:02 +0200, Christian König wrote:

On 24.09.24 at 11:58, Tvrtko Ursulin wrote:

On 24/09/2024 10:45, Tvrtko Ursulin wrote:

On 24/09/2024 09:20, Christian König wrote:

On 16.09.24 at 19:30, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Having removed one re-lock cycle on the entity->lock in a
patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a
tiny bit
larger refactoring we can do the same optimisation on the rq->lock.

(Currently both drm_sched_rq_add_entity() and
drm_sched_rq_update_fifo_locked() take and release the same
lock.)

To achieve this we make drm_sched_rq_update_fifo_locked() and
drm_sched_rq_add_entity() expect the rq->lock to be held.

We also align drm_sched_rq_update_fifo_locked(),
drm_sched_rq_add_entity() and
drm_sched_rq_remove_fifo_locked() function signatures, by
adding rq
as a
parameter to the latter.

v2:
   * Fix after rebase of the series.
   * Avoid naming inconsistency between drm_sched_rq_add/remove. (Christian)

Signed-off-by: Tvrtko Ursulin 
Cc: Christian König 
Cc: Alex Deucher 
Cc: Luben Tuikov 
Cc: Matthew Brost 
Cc: Philipp Stanner 

Reviewed-by: Christian König 

Thanks!

Are you okay to pull this into drm-misc-next or should we do some more
testing on it?

And/or should I resend the series once more in it's entirety so
this
v2 is not a reply-to to the original?

I have to respin for the drm_sched_wakeup fix that landed.

If I should push the series to drm-misc-next then please send it to
me once more.

On the other hand, we should now have two maintainers for that.

Yup, will pick up these responsibilities soonish. Danilo and I have been
at a conference recently and I'm out of office soon for a bit, but you
can expect me / us to take over that work soonish in early autumn.


But please don't push this set. I have messed up the push because I
didn't realize that the first three patches were supposed to land in
-fixes instead of -next.


Should be cleaned up by me by the end of the day.

Sorry for the noise,
Christian.



Regards,
P.


Regards,
Christian.


Regards,

Tvrtko


Regards,

Tvrtko


---
   drivers/gpu/drm/scheduler/sched_entity.c | 12 --
   drivers/gpu/drm/scheduler/sched_main.c   | 29

   include/drm/gpu_scheduler.h  |  3 ++-
   3 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
b/drivers/gpu/drm/scheduler/sched_entity.c
index d982cebc6bee..8ace1f1ea66b 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -515,9 +515,14 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
   next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
   if (next) {
+    struct drm_sched_rq *rq;
+
   spin_lock(&entity->lock);
-    drm_sched_rq_update_fifo_locked(entity, next->submit_ts);
+    rq = entity->rq;
+    spin_lock(&rq->lock);
+    drm_sched_rq_update_fifo_locked(entity, rq,
+    next->submit_ts);
+    spin_unlock(&rq->lock);
   spin_unlock(&entity->lock);
   }
   }
@@ -618,11 +623,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
   sched = rq->sched;
   atomic_inc(sched->score);
+
+    spin_lock(&rq->lock);
   drm_sched_rq_add_entity(rq, entity);
   if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
-    drm_sched_rq_update_fifo_locked(entity, submit_ts);
+    drm_sched_rq_update_fifo_locked(entity, rq, submit_ts);
+    spin_unlock(&rq->lock);
   spin_unlock(&entity->lock);
   drm_sched_wakeup(sched, entity);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index 18a952f73ecb..5c83fb92bb89 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -153,17 +153,18 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a,
   return ktime_before(ent_a->oldest_job_waiting,
ent_b->oldest_job_waiting);
   }
-static inline void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity)
+static void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity,
+    struct drm_sched_rq *rq)
   {
-    struct drm_sched_rq *rq = entity->rq;
-
   if (!RB_EMPTY_NODE(&entity->rb_tree_node)) {
   rb_erase_cached(&entity->rb_tree_node, &rq->rb_tree_root);
   RB_CLEAR_NODE(&entity->rb_tree_node);
   }
   }
-void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, ktime_t ts)
+void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity,
+				     struct drm_sched_rq *rq,
+				     ktime_t ts)
   {
   /*
    * Both locks need to be grabbed, one to protect from
entity->rq change
@@ -171,17 +172,14 @@ void

Re: [PATCH v1 4/9] drm/amd/pm: don't update runpm last_usage on debugfs getter

2024-09-26 Thread Pierre-Eric Pelloux-Prayer




Le 26/09/2024 à 10:55, Lazar, Lijo a écrit :



On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:

Reading pm values from the GPU shouldn't prevent it from being suspended
by resetting the last active timestamp (eg: if a background app
monitors GPU sensors every second, it would prevent the autosuspend
sequence from triggering).

Tested-by: Mario Limonciello 
Signed-off-by: Pierre-Eric Pelloux-Prayer 
---
  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 26 --
  1 file changed, 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 042a4dd1bd6a..c8f34b1a4d8e 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -151,7 +151,6 @@ static ssize_t amdgpu_get_power_dpm_state(struct device 
*dev,
  
  	amdgpu_dpm_get_current_power_state(adev, &pm);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return sysfs_emit(buf, "%s\n",

@@ -275,7 +274,6 @@ static ssize_t 
amdgpu_get_power_dpm_force_performance_level(struct device *dev,
  
  	level = amdgpu_dpm_get_performance_level(adev);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return sysfs_emit(buf, "%s\n",

@@ -373,7 +371,6 @@ static ssize_t amdgpu_get_pp_num_states(struct device *dev,
if (amdgpu_dpm_get_pp_num_states(adev, &data))
memset(&data, 0, sizeof(data));
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	buf_len = sysfs_emit(buf, "states: %d\n", data.nums);

@@ -410,7 +407,6 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev,
  
  	ret = amdgpu_dpm_get_pp_num_states(adev, &data);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	if (ret)

@@ -536,7 +532,6 @@ static ssize_t amdgpu_get_pp_table(struct device *dev,
  
  	size = amdgpu_dpm_get_pp_table(adev, &table);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	if (size <= 0)

@@ -866,7 +861,6 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device 
*dev,
if (size == 0)
size = sysfs_emit(buf, "\n");
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return size;

@@ -944,7 +938,6 @@ static ssize_t amdgpu_get_pp_features(struct device *dev,
if (size <= 0)
size = sysfs_emit(buf, "\n");
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return size;

@@ -1014,7 +1007,6 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
if (size == 0)
size = sysfs_emit(buf, "\n");
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return size;

@@ -1259,7 +1251,6 @@ static ssize_t amdgpu_get_pp_sclk_od(struct device *dev,
  
  	value = amdgpu_dpm_get_sclk_od(adev);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return sysfs_emit(buf, "%d\n", value);

@@ -1317,7 +1308,6 @@ static ssize_t amdgpu_get_pp_mclk_od(struct device *dev,
  
  	value = amdgpu_dpm_get_mclk_od(adev);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return sysfs_emit(buf, "%d\n", value);

@@ -1397,7 +1387,6 @@ static ssize_t amdgpu_get_pp_power_profile_mode(struct 
device *dev,
if (size <= 0)
size = sysfs_emit(buf, "\n");
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return size;

@@ -1485,7 +1474,6 @@ static int amdgpu_hwmon_get_sensor_generic(struct 
amdgpu_device *adev,
/* get the sensor value */
r = amdgpu_dpm_read_sensor(adev, sensor, query, &size);
  
-	pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);

pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
  
  	return r;

@@ -1601,7 +1589,6 @@ static ssize_t amdgpu_get_pcie_bw(struct device *dev,
  
  	amdgpu_asic_get_pcie_usage(adev, &count0, &count1);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return sysfs_emit(buf, "%llu %llu %i\n",

@@ -1734,7 +1721,6 @@ static ssize_t amdgpu_get_apu_thermal_cap(struct device 
*dev,
else
size = sysfs_emit(buf, "failed to get thermal limit\n");
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return size;

@@ -1807,7 +1793,6 @@ static ssize_t amdgpu_get_pm_metrics(struct device *dev,
  
  	size = amdgpu_dpm_get_pm_metrics(adev, buf, PAGE_SIZE);
  
-	pm_runtime_mark_last_busy(ddev->dev);

pm_runtime_put_autosuspend(ddev->dev);
  
  	return size;

@@ -1854,7 +1839,6 @@ static ssize_t amdgpu_get_gpu_metrics(struct device *dev,
memcpy(buf, gpu_metrics, size);
  
  out:

-   pm_runtime_mark_last_busy(ddev->dev);
pm_runtime_put_aut

Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters

2024-09-26 Thread Lazar, Lijo



On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:
> Don't wake up the GPU for reading pm values. Instead, take a runtime
> power-management ref when trying to read it iff the GPU is already
> awake.
> 
> This avoids spurious wake ups (eg: from applets).
> 
> We use pm_runtime_get_if_active(ign_usage_count=true) because we care
> about "is the GPU awake?" not about "is the GPU awake and something else
> prevents suspend?".

One possible downside of this approach is - let's say an application
tries this way on a BACO enabled device (device is visible on bus, but
powered off due to runtime PM)

Get current clock level
If (success) && (not desired clock level)
Set clock level
Submit some jobs to run at set clock level

This type of approach won't work since the get-clock-level call itself
will fail. That said, I don't know if there is any app out there that
does something like this.

Thanks,
Lijo

> 
> Tested-by: Mario Limonciello 
> Signed-off-by: Pierre-Eric Pelloux-Prayer 
> ---
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++---
>  1 file changed, 69 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> index c8f34b1a4d8e..f1f339b75380 100644
> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct device 
> *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   amdgpu_dpm_get_current_power_state(adev, &pm);
>  
> @@ -268,9 +268,9 @@ static ssize_t 
> amdgpu_get_power_dpm_force_performance_level(struct device *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   level = amdgpu_dpm_get_performance_level(adev);
>  
> @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct device 
> *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   if (amdgpu_dpm_get_pp_num_states(adev, &data))
>   memset(&data, 0, sizeof(data));
> @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   amdgpu_dpm_get_current_power_state(adev, &pm);
>  
> @@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct device *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   size = amdgpu_dpm_get_pp_table(adev, &table);
>  
> @@ -840,9 +840,9 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device 
> *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   for (clk_index = 0 ; clk_index < 6 ; clk_index++) {
>   ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], 
> buf, &size);
> @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct device *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtime_get_if_active(ddev->dev, true);
> + if (ret <= 0)
> + return ret ?: -EPERM;
>  
>   size = amdgpu_dpm_get_ppfeature_status(adev, buf);
>   if (size <= 0)
> @@ -996,9 +996,9 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
>   if (adev->in_suspend && !adev->in_runpm)
>   return -EPERM;
>  
> - ret = pm_runtime_resume_and_get(ddev->dev);
> - if (ret < 0)
> - return ret;
> + ret = pm_runtim

Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL

2024-09-26 Thread Christian König

On 25.09.24 at 21:59, Arunpravin Paneer Selvam wrote:

[SNIP]
+int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *filp)
+{
+   struct drm_amdgpu_userq_fence_info *fence_info = NULL;
+   struct drm_amdgpu_userq_wait *wait_info = data;
+   u32 *syncobj_handles, *bo_handles;
+   struct dma_fence **fences = NULL;
+   u32 num_syncobj, num_bo_handles;
+   struct drm_gem_object **gobj;
+   struct drm_exec exec;
+   int r, i, entry, cnt;
+   u64 num_fences = 0;
+
+   num_bo_handles = wait_info->num_bo_handles;
+   bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array),
+sizeof(u32) * num_bo_handles);
+   if (IS_ERR(bo_handles))
+   return PTR_ERR(bo_handles);
+
+   num_syncobj = wait_info->num_syncobj_handles;
+   syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array),
+				 sizeof(u32) * num_syncobj);
+   if (IS_ERR(syncobj_handles)) {
+   r = PTR_ERR(syncobj_handles);
+   goto free_bo_handles;
+   }
+
+   /* Array of GEM object handles */
+   gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL);
+   if (!gobj) {
+   r = -ENOMEM;
+   goto free_syncobj_handles;
+   }
+
+   for (entry = 0; entry < num_bo_handles; entry++) {
+   gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]);
+   if (!gobj[entry]) {
+   r = -ENOENT;
+   goto put_gobj;
+   }
+   }
+
+   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+   drm_exec_until_all_locked(&exec) {
+   r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0);
+   drm_exec_retry_on_contention(&exec);
+   if (r) {
+   drm_exec_fini(&exec);
+   goto put_gobj;
+   }
+   }
+
+   if (!wait_info->num_fences) {
+   /* Count syncobj's fence */
+   for (i = 0; i < num_syncobj; i++) {
+   struct dma_fence *fence;
+
+   r = drm_syncobj_find_fence(filp, syncobj_handles[i],
+  0, 0, &fence);
+   dma_fence_put(fence);
+
+   if (r || !fence)
+   continue;
+
+   num_fences++;
+   }
+
+   /* Count GEM objects fence */
+   for (i = 0; i < num_bo_handles; i++) {
+   struct dma_resv_iter resv_cursor;
+   struct dma_fence *fence;
+
+   dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv,
+   
dma_resv_usage_rw(wait_info->bo_wait_flags &
+   AMDGPU_USERQ_BO_WRITE), fence)
+   num_fences++;


We should probably adjust the UAPI here once more.

The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the 
whole IOCTL instead of per BO.


So the best approach would probably be to drop the AMDGPU_USERQ_BO_WRITE 
flag and split up the array of BOs into readers and writers.


Can you work on that Arun? Shouldn't be more than a bit typing exercise.

Thanks,
Christian.



Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL

2024-09-26 Thread Paneer Selvam, Arunpravin

Hi Christian,

On 9/26/2024 2:57 PM, Christian König wrote:

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

[SNIP]
+int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
+    struct drm_file *filp)
+{
+    struct drm_amdgpu_userq_fence_info *fence_info = NULL;
+    struct drm_amdgpu_userq_wait *wait_info = data;
+    u32 *syncobj_handles, *bo_handles;
+    struct dma_fence **fences = NULL;
+    u32 num_syncobj, num_bo_handles;
+    struct drm_gem_object **gobj;
+    struct drm_exec exec;
+    int r, i, entry, cnt;
+    u64 num_fences = 0;
+
+    num_bo_handles = wait_info->num_bo_handles;
+    bo_handles = 
memdup_user(u64_to_user_ptr(wait_info->bo_handles_array),

+ sizeof(u32) * num_bo_handles);
+    if (IS_ERR(bo_handles))
+    return PTR_ERR(bo_handles);
+
+    num_syncobj = wait_info->num_syncobj_handles;
+    syncobj_handles = 
memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array),

+  sizeof(u32) * num_syncobj);
+    if (IS_ERR(syncobj_handles)) {
+    r = PTR_ERR(syncobj_handles);
+    goto free_bo_handles;
+    }
+
+    /* Array of GEM object handles */
+    gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL);
+    if (!gobj) {
+    r = -ENOMEM;
+    goto free_syncobj_handles;
+    }
+
+    for (entry = 0; entry < num_bo_handles; entry++) {
+    gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]);
+    if (!gobj[entry]) {
+    r = -ENOENT;
+    goto put_gobj;
+    }
+    }
+
+    drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+    drm_exec_until_all_locked(&exec) {
+    r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0);
+    drm_exec_retry_on_contention(&exec);
+    if (r) {
+    drm_exec_fini(&exec);
+    goto put_gobj;
+    }
+    }
+
+    if (!wait_info->num_fences) {
+    /* Count syncobj's fence */
+    for (i = 0; i < num_syncobj; i++) {
+    struct dma_fence *fence;
+
+    r = drm_syncobj_find_fence(filp, syncobj_handles[i],
+   0, 0, &fence);
+    dma_fence_put(fence);
+
+    if (r || !fence)
+    continue;
+
+    num_fences++;
+    }
+
+    /* Count GEM objects fence */
+    for (i = 0; i < num_bo_handles; i++) {
+    struct dma_resv_iter resv_cursor;
+    struct dma_fence *fence;
+
+    dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv,
+ dma_resv_usage_rw(wait_info->bo_wait_flags &
+    AMDGPU_USERQ_BO_WRITE), fence)
+    num_fences++;


We should probably adjust the UAPI here once more.

The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the 
whole IOCTL instead of per BO.


So the best approach would probably be to drop the 
AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers 
and writers.


Can you work on that Arun? Shouldn't be more than a bit typing exercise.

Sure, I will modify and send the next version of this file.

Thanks,
Arun.


Thanks,
Christian.





Re: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-26 Thread Lazar, Lijo



On 9/26/2024 2:31 PM, Xu, Feifei wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
>>> + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
>>> +   (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
> 
> Is there a corner case where reloading with a different TOS version and
> a pending NPS refresh co-exist?
> 

I guess you are referring to the below corner case
1) Place NPS request
2) Unload Driver
3) Reinstall driver with a different TOS (possible but quite unlikely)
4) Driver reload
5) Driver checks TOS version first and goes for a reset
6) reset_flag of GMC is not set, hence it doesn't refresh the NPS range.


I think changing the order in soc15_need_reset_on_init() to check for
NPS request before TOS version check will solve this.

Thanks,
Lijo

> Thanks,
> Feifei
> 
> -Original Message-
> From: amd-gfx  On Behalf Of Lijo Lazar
> Sent: Tuesday, September 24, 2024 1:57 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Deucher, Alexander 
> ; Koenig, Christian ; 
> Bhardwaj, Rajneesh ; Errabolu, Ramesh 
> 
> Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init
> 
> Add a callback to check if there is any condition detected by GMC block for 
> reset on init. One case is if a pending NPS change request is detected. If 
> reset is done because of NPS switch, refresh NPS info from discovery table.
> 
> Signed-off-by: Lijo Lazar 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 -  
> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +
>  drivers/gpu/drm/amd/amdgpu/soc15.c  |  2 ++
>  3 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 21f1e65c9dc9..011fe3a847d0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device 
> *adev,
> struct amdgpu_gmc_memrange *ranges;
> int range_cnt, ret, i, j;
> uint32_t nps_type;
> +   bool refresh;
> 
> if (!mem_ranges)
> return -EINVAL;
> 
> +   refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
> + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
> ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges,
> -   &range_cnt, false);
> +   &range_cnt, refresh);
> 
> if (ret)
> return ret;
> @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct 
> amdgpu_device *adev)
> adev->dev,
> "NPS mode change request done, reload driver to 
> complete the change\n");  }
> +
> +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) {
> +   if (adev->gmc.gmc_funcs->need_reset_on_init)
> +   return adev->gmc.gmc_funcs->need_reset_on_init(adev);
> +
> +   return false;
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> index b13d6adb5efd..d4cd247fe574 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> @@ -78,6 +78,8 @@ enum amdgpu_memory_partition {
>  BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \
>  BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE))
> 
> +#define AMDGPU_GMC_INIT_RESET_NPS  BIT(0)
> +
>  /*
>   * GMC page fault information
>   */
> @@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs {
> /* Request NPS mode */
> int (*request_mem_partition_mode)(struct amdgpu_device *adev,
>   int nps_mode);
> +   bool (*need_reset_on_init)(struct amdgpu_device *adev);
>  };
> 
>  struct amdgpu_xgmi_ras {
> @@ -314,6 +317,7 @@ struct amdgpu_gmc {
> const struct amdgpu_gmc_funcs   *gmc_funcs;
> enum amdgpu_memory_partitionrequested_nps_mode;
> uint32_t supported_nps_modes;
> +   uint32_t reset_flags;
> 
> struct amdgpu_xgmi xgmi;
> struct amdgpu_irq_src   ecc_irq;
> @@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device 
> *adev,  int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
> int nps_mode);
>  void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev);
> +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev);
> 
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 619933f252aa..97ca4931a7ef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -833,6 +833,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device 
> *adev)
> 
> if 

Re: [PATCH v1 6/9] drm/amd/pm: stop extra checks for runtime pm state

2024-09-26 Thread Lazar, Lijo



On 9/25/2024 7:59 PM, Pierre-Eric Pelloux-Prayer wrote:
> 
> 
> Le 25/09/2024 à 15:37, Lazar, Lijo a écrit :
>>
>>
>> On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:
>>> pm_runtime_get_if_in_use already checks if the GPU is active,
>>> so there's no need to manually check the runtime PM status:
>>>
>>>     if (adev->in_suspend && !adev->in_runpm)
>>>    return -EPERM;
>>>
>>> Tested-by: Mario Limonciello 
>>> Signed-off-by: Pierre-Eric Pelloux-Prayer
>>> 
>>> ---
>>>   drivers/gpu/drm/amd/pm/amdgpu_pm.c | 46 --
>>>   1 file changed, 46 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> index f1f339b75380..13be5e017a01 100644
>>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> @@ -142,8 +142,6 @@ static ssize_t amdgpu_get_power_dpm_state(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>   
>>
>> I believe this check is for accesses before the device is fully resumed
>> from a suspend sequence. That is not tied to runtime PM.
> 
> AFAICT in_suspend / in_runpm are only set from resume / suspend
> sequences, so checking runtime_status != RPM_ACTIVE like
> pm_runtime_get_if_in_use does should provide the same result.
> 
> (= during resume the device status is RPM_RESUMING)
> 
> Pierre-Eric
> 

On devices whose runtime PM is forbidden, I'm not sure if it goes
through RPM_ state changes or just statically remains at RPM_ACTIVE.

Regardless, these checks are removed only for sys/debug fs attributes.
Hence as Alex mentioned this access check was not required in the first
place.

Thanks,
Lijo
> 
>>
>> Thanks,
>> Lijo
>>
>>>   ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -265,8 +263,6 @@ static ssize_t
>>> amdgpu_get_power_dpm_force_performance_level(struct device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -361,8 +357,6 @@ static ssize_t amdgpu_get_pp_num_states(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -396,8 +390,6 @@ static ssize_t amdgpu_get_pp_cur_state(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -523,8 +515,6 @@ static ssize_t amdgpu_get_pp_table(struct device
>>> *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -837,8 +827,6 @@ static ssize_t
>>> amdgpu_get_pp_od_clk_voltage(struct device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -927,8 +915,6 @@ static ssize_t amdgpu_get_pp_features(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -993,8 +979,6 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -1242,8 +1226,6 @@ static ssize_t amdgpu_get_pp_sclk_od(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -1299,8 +1281,6 @@ static ssize_t amdgpu_get_pp_mclk_od(struct
>>> device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -    return -EPERM;
>>>     ret = pm_runtime_get_if_active(ddev->dev, true);
>>>   if (ret <= 0)
>>> @@ -1376,8 +1356,6 @@ static ssize_t
>>> amdgpu_get_pp_power_profile_mode(struct device *dev,
>>>     if (amdgpu_in_reset(adev))
>>>   return -EPERM;
>>> -    if (adev->in_suspend && !adev->in_runpm)
>>> -

Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL

2024-09-26 Thread Paneer Selvam, Arunpravin

Hi Christian,

On 9/26/2024 3:04 PM, Christian König wrote:

Am 26.09.24 um 11:31 schrieb Paneer Selvam, Arunpravin:

Hi Christian,

On 9/26/2024 2:57 PM, Christian König wrote:

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

[SNIP]
+int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
+    struct drm_file *filp)
+{
+    struct drm_amdgpu_userq_fence_info *fence_info = NULL;
+    struct drm_amdgpu_userq_wait *wait_info = data;
+    u32 *syncobj_handles, *bo_handles;
+    struct dma_fence **fences = NULL;
+    u32 num_syncobj, num_bo_handles;
+    struct drm_gem_object **gobj;
+    struct drm_exec exec;
+    int r, i, entry, cnt;
+    u64 num_fences = 0;
+
+    num_bo_handles = wait_info->num_bo_handles;
+    bo_handles = 
memdup_user(u64_to_user_ptr(wait_info->bo_handles_array),

+ sizeof(u32) * num_bo_handles);
+    if (IS_ERR(bo_handles))
+    return PTR_ERR(bo_handles);
+
+    num_syncobj = wait_info->num_syncobj_handles;
+    syncobj_handles = 
memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array),

+  sizeof(u32) * num_syncobj);
+    if (IS_ERR(syncobj_handles)) {
+    r = PTR_ERR(syncobj_handles);
+    goto free_bo_handles;
+    }
+
+    /* Array of GEM object handles */
+    gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL);
+    if (!gobj) {
+    r = -ENOMEM;
+    goto free_syncobj_handles;
+    }
+
+    for (entry = 0; entry < num_bo_handles; entry++) {
+    gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]);
+    if (!gobj[entry]) {
+    r = -ENOENT;
+    goto put_gobj;
+    }
+    }
+
+    drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+    drm_exec_until_all_locked(&exec) {
+    r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0);
+    drm_exec_retry_on_contention(&exec);
+    if (r) {
+    drm_exec_fini(&exec);
+    goto put_gobj;
+    }
+    }
+
+    if (!wait_info->num_fences) {
+    /* Count syncobj's fence */
+    for (i = 0; i < num_syncobj; i++) {
+    struct dma_fence *fence;
+
+    r = drm_syncobj_find_fence(filp, syncobj_handles[i],
+   0, 0, &fence);
+    dma_fence_put(fence);
+
+    if (r || !fence)
+    continue;
+
+    num_fences++;
+    }
+
+    /* Count GEM objects fence */
+    for (i = 0; i < num_bo_handles; i++) {
+    struct dma_resv_iter resv_cursor;
+    struct dma_fence *fence;
+
+    dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv,
+ dma_resv_usage_rw(wait_info->bo_wait_flags &
+    AMDGPU_USERQ_BO_WRITE), fence)
+    num_fences++;


We should probably adjust the UAPI here once more.

The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for 
the whole IOCTL instead of per BO.


So the best approach would probably be to drop the 
AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into 
readers and writers.


Can you work on that Arun? Shouldn't be more than a bit typing 
exercise.

Sure, I will modify and send the next version of this file.


Thanks.

In the meantime I'm going to review the rest of the series, so there 
could be more comments. But please update the UAPI first.


Can we have the bo_handles_read_array, num_read_bo_handles, 
bo_handles_write_array, num_write_bo_handles in both

signal IOCTL and wait IOCTL?

Thanks,
Arun.


Regards,
Christian.



Thanks,
Arun.


Thanks,
Christian.









Re: [PATCH v7 09/10] drm/amd/display: remove dc_edid handler from dm_helpers_parse_edid_caps

2024-09-26 Thread Melissa Wen





On 25/09/2024 14:55, Mario Limonciello wrote:

Alex,

Unfortunately I can't reproduce the regression on the APU I tried. 
However I do have a suspicion on a fix.


Can you see if this helps?  If it does, we can squash it in.

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index bf847ac55475..e75cd527d753 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7008,6 +7008,7 @@ static void amdgpu_dm_connector_destroy(struct 
drm_connector *connector)

    kfree(aconnector->i2c);
    }
    kfree(aconnector->dm_dp_aux.aux.name);
+   drm_edid_free(aconnector->drm_edid);

    kfree(connector);
 }

If that doesn't help, I did test dropping this patch and it doesn't 
affect the last patch in the series, that one still works so I'm fine 
with dropping it and we can follow up later.

yes, I agree with Mario's proposal.


On 9/25/2024 12:06, Alex Hung wrote:

Mario and Melissa,

This patch causes a regression on 7900 XTX in an IGT test: 
amd_mem_leak's connector-suspend-resume.


Is this patch necessary in this series, or is it independent of 
other patches, i.e. can it be dropped from this series until fixed?


Cheers,
Alex Hung

On 9/18/24 15:38, Mario Limonciello wrote:

From: Melissa Wen 

We can parse EDID caps from drm_edid and drm_eld, and every call to
dm_helpers_parse_edid_caps is made in a state where drm_edid is already
set on the amdgpu connector.

Signed-off-by: Melissa Wen 
Co-developed-by: Mario Limonciello 
Signed-off-by: Mario Limonciello 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  5 +---
  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 30 
---

  drivers/gpu/drm/amd/display/dc/dm_helpers.h   |  1 -
  .../drm/amd/display/dc/link/link_detection.c  |  6 ++--
  4 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index bd8fb9968c7c..bf847ac55475 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7123,10 +7123,7 @@ static void 
amdgpu_dm_connector_funcs_force(struct drm_connector *connector)
  memset(&dc_em_sink->edid_caps, 0, sizeof(struct 
dc_edid_caps));

  memmove(dc_em_sink->dc_edid.raw_edid, edid,
  (edid->extensions + 1) * EDID_LENGTH);
-    dm_helpers_parse_edid_caps(
-    dc_link,
-    &dc_em_sink->dc_edid,
-    &dc_em_sink->edid_caps);
+    dm_helpers_parse_edid_caps(dc_link, &dc_em_sink->edid_caps);
  }
  }
diff --git 
a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c

index d43ed3ea000b..8f4be7a1ec45 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -83,32 +83,24 @@ static void apply_edid_quirks(struct 
drm_edid_product_id *product_id,

   * dm_helpers_parse_edid_caps() - Parse edid caps
   *
   * @link: current detected link
- * @edid:    [in] pointer to edid
   * @edid_caps:    [in] pointer to edid caps
   *
   * Return: void
   */
-enum dc_edid_status dm_helpers_parse_edid_caps(
-    struct dc_link *link,
-    const struct dc_edid *edid,
-    struct dc_edid_caps *edid_caps)
+enum dc_edid_status dm_helpers_parse_edid_caps(struct dc_link *link,
+   struct dc_edid_caps *edid_caps)
  {
  struct amdgpu_dm_connector *aconnector = link->priv;
  struct drm_connector *connector = &aconnector->base;
  const struct drm_edid *drm_edid = aconnector->drm_edid;
  struct drm_edid_product_id product_id;
-    struct edid *edid_buf = edid ? (struct edid *) edid->raw_edid : 
NULL;

  int sad_count;
  int i = 0;
-
  enum dc_edid_status result = EDID_OK;
-    if (!edid_caps || !edid)
+    if (!edid_caps || !drm_edid)
  return EDID_BAD_INPUT;
-    if (!drm_edid_is_valid(edid_buf))
-    result = EDID_BAD_CHECKSUM;
-
  drm_edid_get_product_id(drm_edid, &product_id);
  edid_caps->manufacturer_id = 
le16_to_cpu(product_id.manufacturer_name);

@@ -920,19 +912,23 @@ enum dc_edid_status dm_helpers_read_local_edid(
  if (!drm_edid)
  return EDID_NO_RESPONSE;
-    edid = drm_edid_raw(drm_edid); // FIXME: Get rid of 
drm_edid_raw()

+    /* FIXME: Get rid of drm_edid_raw()
+ * Raw edid is still needed for dm_helpers_dp_write_dpcd()
+ */
+    edid = drm_edid_raw(drm_edid);
  sink->dc_edid.length = EDID_LENGTH * (edid->extensions + 1);
  memmove(sink->dc_edid.raw_edid, (uint8_t *)edid, sink->dc_edid.length);

  edid_status = dm_helpers_parse_edid_caps(
  link,
-    &sink->dc_edid,
  &sink->edid_caps);
-  

Re: [PATCH] drm/amdgpu: fix vbios fetching for SR-IOV

2024-09-26 Thread Lazar, Lijo



On 9/25/2024 11:50 PM, Alex Deucher wrote:
> SR-IOV fetches the vbios from VRAM in some cases.
> Re-enable the VRAM path for dGPUs and rename the function
> to make it clear that it is not IGP specific.
> 
> Fixes: 042658d17a54 ("drm/amdgpu: clean up vbios fetching code")
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> index 46bf623919d7c..45affc02548c1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> @@ -87,8 +87,9 @@ static bool check_atom_bios(uint8_t *bios, size_t size)
>   * part of the system bios.  On boot, the system bios puts a
>   * copy of the igp rom at the start of vram if a discrete card is
>   * present.
> + * For SR-IOV, the vbios image is also put in VRAM in the VF.
>   */
> -static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
> +static bool amdgpu_read_bios_from_vram(struct amdgpu_device *adev)
>  {
>   uint8_t __iomem *bios;
>   resource_size_t vram_base;
> @@ -414,7 +415,7 @@ static bool amdgpu_get_bios_apu(struct amdgpu_device 
> *adev)
>   goto success;
>   }
>  
> - if (igp_read_bios_from_vram(adev)) {
> + if (amdgpu_read_bios_from_vram(adev)) {
>   dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n");
>   goto success;
>   }
> @@ -448,6 +449,12 @@ static bool amdgpu_get_bios_dgpu(struct amdgpu_device 
> *adev)
>   goto success;
>   }
>  
> + /* this is required for SR-IOV */
> + if (amdgpu_read_bios_from_vram(adev)) {

If this is only expected for VFs, then it's better to add
amdgpu_sriov_vf(adev) check for this call.

Thanks,
Lijo

> + dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n");
> + goto success;
> + }
> +
>   if (amdgpu_read_platform_bios(adev)) {
>   dev_info(adev->dev, "Fetched VBIOS from platform\n");
>   goto success;


Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL

2024-09-26 Thread Christian König

Am 26.09.24 um 12:26 schrieb Paneer Selvam, Arunpravin:

Hi Christian,

On 9/26/2024 3:04 PM, Christian König wrote:

Am 26.09.24 um 11:31 schrieb Paneer Selvam, Arunpravin:

Hi Christian,

On 9/26/2024 2:57 PM, Christian König wrote:

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

[SNIP]
+int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
+    struct drm_file *filp)
+{
+    struct drm_amdgpu_userq_fence_info *fence_info = NULL;
+    struct drm_amdgpu_userq_wait *wait_info = data;
+    u32 *syncobj_handles, *bo_handles;
+    struct dma_fence **fences = NULL;
+    u32 num_syncobj, num_bo_handles;
+    struct drm_gem_object **gobj;
+    struct drm_exec exec;
+    int r, i, entry, cnt;
+    u64 num_fences = 0;
+
+    num_bo_handles = wait_info->num_bo_handles;
+    bo_handles = 
memdup_user(u64_to_user_ptr(wait_info->bo_handles_array),

+ sizeof(u32) * num_bo_handles);
+    if (IS_ERR(bo_handles))
+    return PTR_ERR(bo_handles);
+
+    num_syncobj = wait_info->num_syncobj_handles;
+    syncobj_handles = 
memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array),

+  sizeof(u32) * num_syncobj);
+    if (IS_ERR(syncobj_handles)) {
+    r = PTR_ERR(syncobj_handles);
+    goto free_bo_handles;
+    }
+
+    /* Array of GEM object handles */
+    gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL);
+    if (!gobj) {
+    r = -ENOMEM;
+    goto free_syncobj_handles;
+    }
+
+    for (entry = 0; entry < num_bo_handles; entry++) {
+    gobj[entry] = drm_gem_object_lookup(filp, 
bo_handles[entry]);

+    if (!gobj[entry]) {
+    r = -ENOENT;
+    goto put_gobj;
+    }
+    }
+
+    drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+    drm_exec_until_all_locked(&exec) {
+    r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0);
+    drm_exec_retry_on_contention(&exec);
+    if (r) {
+    drm_exec_fini(&exec);
+    goto put_gobj;
+    }
+    }
+
+    if (!wait_info->num_fences) {
+    /* Count syncobj's fence */
+    for (i = 0; i < num_syncobj; i++) {
+    struct dma_fence *fence;
+
+    r = drm_syncobj_find_fence(filp, syncobj_handles[i],
+   0, 0, &fence);
+    dma_fence_put(fence);
+
+    if (r || !fence)
+    continue;
+
+    num_fences++;
+    }
+
+    /* Count GEM objects fence */
+    for (i = 0; i < num_bo_handles; i++) {
+    struct dma_resv_iter resv_cursor;
+    struct dma_fence *fence;
+
+    dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv,
+ dma_resv_usage_rw(wait_info->bo_wait_flags &
+    AMDGPU_USERQ_BO_WRITE), fence)
+    num_fences++;


We should probably adjust the UAPI here once more.

The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for 
the whole IOCTL instead of per BO.


So the best approach would probably be to drop the 
AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into 
readers and writers.


Can you work on that Arun? Shouldn't be more than a bit typing 
exercise.

Sure, I will modify and send the next version of this file.


Thanks.

In the meantime I'm going to review the rest of the series, so there 
could be more comments. But please update the UAPI first.


Can we have the bo_handles_read_array, num_read_bo_handles, 
bo_handles_write_array, num_write_bo_handles in both

signal IOCTL and wait IOCTL?


I think so, yes.

Regards,
Christian.



Thanks,
Arun.


Regards,
Christian.



Thanks,
Arun.


Thanks,
Christian.











[PATCH v2] docs/gpu: ci: update flake tests requirements

2024-09-26 Thread Vignesh Raman
Update the documentation to require linking to a relevant GitLab
issue for each new flake entry instead of an email report. Added
specific GitLab issue URLs for i915, xe and other drivers.

Signed-off-by: Vignesh Raman 
---

v2:
- Add gitlab issue link for msm driver.

---
 Documentation/gpu/automated_testing.rst | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/gpu/automated_testing.rst 
b/Documentation/gpu/automated_testing.rst
index 2d5a28866afe..f918fe56f2b0 100644
--- a/Documentation/gpu/automated_testing.rst
+++ b/Documentation/gpu/automated_testing.rst
@@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific 
hardware revision are
 known to behave unreliably. These tests won't cause a job to fail regardless of
 the result. They will still be run.
 
-Each new flake entry must be associated with a link to the email reporting the
-bug to the author of the affected driver, the board name or Device Tree name of
-the board, the first kernel version affected, the IGT version used for tests,
-and an approximation of the failure rate.
+Each new flake entry must include a link to the relevant GitLab issue, the 
board
+name or Device Tree name, the first kernel version affected, the IGT version 
used
+for tests and an approximation of the failure rate.
 
 They should be provided under the following format::
 
-  # Bug Report: $LORE_OR_PATCHWORK_URL
+  # Bug Report: $GITLAB_ISSUE
   # Board Name: broken-board.dtb
   # Linux Version: 6.6-rc1
   # IGT Version: 1.28-gd2af13d9f
   # Failure Rate: 100
   flaky-test
 
+The GitLab issue must include the logs and the pipeline link. Use the 
appropriate
+link below to create an issue.
+https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver
+https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver
+https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver
+https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers
+
 drivers/gpu/drm/ci/${DRIVER_NAME}-${HW_REVISION}-skips.txt
 ---
 
-- 
2.43.0



Re: [PATCH v2] docs/gpu: ci: update flake tests requirements

2024-09-26 Thread Maxime Ripard
On Thu, Sep 26, 2024 at 12:36:49PM GMT, Vignesh Raman wrote:
> Update the documentation to require linking to a relevant GitLab
> issue for each new flake entry instead of an email report. Added
> specific GitLab issue URLs for i915, xe and other drivers.
> 
> Signed-off-by: Vignesh Raman 
> ---
> 
> v2:
> - Add gitlab issue link for msm driver.
> 
> ---
>  Documentation/gpu/automated_testing.rst | 16 +++-
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/gpu/automated_testing.rst 
> b/Documentation/gpu/automated_testing.rst
> index 2d5a28866afe..f918fe56f2b0 100644
> --- a/Documentation/gpu/automated_testing.rst
> +++ b/Documentation/gpu/automated_testing.rst
> @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific 
> hardware revision are
>  known to behave unreliably. These tests won't cause a job to fail regardless 
> of
>  the result. They will still be run.
>  
> -Each new flake entry must be associated with a link to the email reporting 
> the
> -bug to the author of the affected driver, the board name or Device Tree name 
> of
> -the board, the first kernel version affected, the IGT version used for tests,
> -and an approximation of the failure rate.
> +Each new flake entry must include a link to the relevant GitLab issue, the 
> board
> +name or Device Tree name, the first kernel version affected, the IGT version 
> used
> +for tests and an approximation of the failure rate.
>  
>  They should be provided under the following format::
>  
> -  # Bug Report: $LORE_OR_PATCHWORK_URL
> +  # Bug Report: $GITLAB_ISSUE
># Board Name: broken-board.dtb
># Linux Version: 6.6-rc1
># IGT Version: 1.28-gd2af13d9f
># Failure Rate: 100
>flaky-test
>  
> +The GitLab issue must include the logs and the pipeline link. Use the 
> appropriate
> +link below to create an issue.
> +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver
> +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver
> +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver
> +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers
> +

I can't comment for the others, but drm-misc at least still requires
reporting issues by mail, so, no, sorry, we can't switch to gitlab only
for now.

Maxime




Re: [PATCH v2] docs/gpu: ci: update flake tests requirements

2024-09-26 Thread Dmitry Baryshkov
On Thu, Sep 26, 2024 at 12:36:49PM GMT, Vignesh Raman wrote:
> Update the documentation to require linking to a relevant GitLab
> issue for each new flake entry instead of an email report. Added
> specific GitLab issue URLs for i915, xe and other drivers.
> 
> Signed-off-by: Vignesh Raman 
> ---
> 
> v2:
> - Add gitlab issue link for msm driver.
> 
> ---
>  Documentation/gpu/automated_testing.rst | 16 +++-
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/gpu/automated_testing.rst 
> b/Documentation/gpu/automated_testing.rst
> index 2d5a28866afe..f918fe56f2b0 100644
> --- a/Documentation/gpu/automated_testing.rst
> +++ b/Documentation/gpu/automated_testing.rst
> @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific 
> hardware revision are
>  known to behave unreliably. These tests won't cause a job to fail regardless 
> of
>  the result. They will still be run.
>  
> -Each new flake entry must be associated with a link to the email reporting 
> the
> -bug to the author of the affected driver, the board name or Device Tree name 
> of
> -the board, the first kernel version affected, the IGT version used for tests,
> -and an approximation of the failure rate.
> +Each new flake entry must include a link to the relevant GitLab issue, the 
> board
> +name or Device Tree name, the first kernel version affected, the IGT version 
> used
> +for tests and an approximation of the failure rate.
>  
>  They should be provided under the following format::
>  
> -  # Bug Report: $LORE_OR_PATCHWORK_URL
> +  # Bug Report: $GITLAB_ISSUE
># Board Name: broken-board.dtb
># Linux Version: 6.6-rc1
># IGT Version: 1.28-gd2af13d9f
># Failure Rate: 100
>flaky-test
>  
> +The GitLab issue must include the logs and the pipeline link. Use the 
> appropriate
> +link below to create an issue.
> +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver
> +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver
> +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver

Ack for the MSM part.

> +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers

-- 
With best wishes
Dmitry


[PATCH v2 1/4] drm/amdgpu/sdma5: split out per instance resume function

2024-09-26 Thread jiadong.zhu
From: Jiadong Zhu 

Extract the resume sequence from sdma_v5_0_gfx_resume for
starting/restarting an individual instance.

Signed-off-by: Jiadong Zhu 
Acked-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 253 ++---
 1 file changed, 138 insertions(+), 115 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index 3e48ea38385d..e813da1e48aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -705,14 +705,16 @@ static void sdma_v5_0_enable(struct amdgpu_device *adev, 
bool enable)
 }
 
 /**
- * sdma_v5_0_gfx_resume - setup and start the async dma engines
+ * sdma_v5_0_gfx_resume_instance - start/restart an individual sdma engine
  *
  * @adev: amdgpu_device pointer
+ * @i: instance
+ * @restore: whether to restore the wptr when restarting
  *
- * Set up the gfx DMA ring buffers and enable them (NAVI10).
- * Returns 0 for success, error for failure.
+ * Set up the gfx DMA ring buffers and enable them. On restart, the wptr
+ * and rptr are restored.
+ * Returns 0 on success, negative error code on failure.
  */
-static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
+static int sdma_v5_0_gfx_resume_instance(struct amdgpu_device *adev, int i, 
bool restore)
 {
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl;
@@ -722,142 +724,163 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device 
*adev)
u32 temp;
u32 wptr_poll_cntl;
u64 wptr_gpu_addr;
-   int i, r;
 
-   for (i = 0; i < adev->sdma.num_instances; i++) {
-   ring = &adev->sdma.instance[i].ring;
+   ring = &adev->sdma.instance[i].ring;
 
-   if (!amdgpu_sriov_vf(adev))
-   WREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
+   if (!amdgpu_sriov_vf(adev))
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
 
-   /* Set ring buffer size in dwords */
-   rb_bufsz = order_base_2(ring->ring_size / 4);
-   rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL));
-   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, 
rb_bufsz);
+   /* Set ring buffer size in dwords */
+   rb_bufsz = order_base_2(ring->ring_size / 4);
+   rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL));
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, rb_bufsz);
 #ifdef __BIG_ENDIAN
-   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, 
RB_SWAP_ENABLE, 1);
-   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL,
-   RPTR_WRITEBACK_SWAP_ENABLE, 1);
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SWAP_ENABLE, 1);
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL,
+   RPTR_WRITEBACK_SWAP_ENABLE, 1);
 #endif
-   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL), rb_cntl);
-
-   /* Initialize the ring buffer's read and write pointers */
+   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL), rb_cntl);
+
+   /* Initialize the ring buffer's read and write pointers */
+   if (restore) {
+   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR), lower_32_bits(ring->wptr << 2));
+   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR_HI), upper_32_bits(ring->wptr << 2));
+   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));
+   } else {
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR), 0);
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR_HI), 0);
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR), 0);
WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_HI), 0);
-
-   /* setup the wptr shadow polling */
-   wptr_gpu_addr = ring->wptr_gpu_addr;
-   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
-  lower_32_bits(wptr_gpu_addr));
-   WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
-  upper_32_bits(wptr_gpu_addr));
-   wptr_poll_cntl = RREG32_SOC15_IP(GC, 
sdma_v5_0_get_reg_offset(adev, i,
-
mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
-   wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl,
-  

[PATCH v2 2/4] drm/amdgpu/sdma5: implement ring reset callback for sdma5

2024-09-26 Thread jiadong.zhu
From: Jiadong Zhu 

Implement sdma queue reset callback via MMIO.

v2: enter/exit safe mode when resetting the sdma queue.

Signed-off-by: Jiadong Zhu 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 88 ++
 1 file changed, 88 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
index e813da1e48aa..416273d917e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
@@ -1555,6 +1555,93 @@ static int sdma_v5_0_soft_reset(void *handle)
return 0;
 }
 
+static int sdma_v5_0_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
+{
+   struct amdgpu_device *adev = ring->adev;
+   int i, j, r;
+   u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, 
stat1_reg;
+
+   if (amdgpu_sriov_vf(adev))
+   return -EINVAL;
+
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   if (ring == &adev->sdma.instance[i].ring)
+   break;
+   }
+
+   if (i == adev->sdma.num_instances) {
+   DRM_ERROR("sdma instance not found\n");
+   return -EINVAL;
+   }
+
+   amdgpu_gfx_rlc_enter_safe_mode(adev, 0);
+
+   /* stop queue */
+   ib_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_IB_CNTL));
+   ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), ib_cntl);
+
+   rb_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL));
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl);
+
+   /* engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit to 
1 */
+   freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE));
+   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
+
+   for (j = 0; j < adev->usec_timeout; j++) {
+   freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_FREEZE));
+   if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1)
+   break;
+   udelay(1);
+   }
+
+   /* if FROZEN was not seen, check that the sdma copy engine is fully idle */
+   if (j == adev->usec_timeout) {
+   stat1_reg = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_STATUS1_REG));
+   if ((stat1_reg & 0x3FF) != 0x3FF) {
+   DRM_ERROR("cannot soft reset as sdma not idle\n");
+   r = -ETIMEDOUT;
+   goto err0;
+   }
+   }
+
+   f32_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_F32_CNTL));
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl);
+
+   cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL));
+   cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl);
+
+   /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 
mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */
+   preempt = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_PREEMPT));
+   preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), preempt);
+
+   soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET);
+   soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i;
+
+   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
+
+   udelay(50);
+
+   soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i);
+   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
+
+   /* unfreeze */
+   freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE));
+   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0);
+   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
+
+   r = sdma_v5_0_gfx_resume_instance(adev, i, true);
+
+err0:
+   amdgpu_gfx_rlc_exit_safe_mode(adev, 0);
+   return r;
+}
+
 static int sdma_v5_0_ring_preempt_ib(struct amdgpu_ring *ring)
 {
int i, r = 0;
@@ -1897,6 +1984,7 @@ static const struct amdgpu_ring_funcs 
sdma_v5_0_ring_funcs = {
.emit_reg_write_reg_wait = sdma_v5_0_ring_emit_reg_write_reg_wait,
.init_cond_exec = sdma_v5_0_ring_init_cond_exec,
.preempt_ib = sdma_v5_0_ring_preempt_ib,
+   .reset = sdma_v5_0_reset_queue,
 };
 
 static void sdma_v5_0_set_ring_funcs(struct amdgpu_device *adev)
-- 
2.25.1
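The stop sequence in sdma_v5_0_reset_queue() above polls for the FROZEN bit and, on timeout, falls back to checking the copy-engine idle bits in STATUS1. Below is a rough userspace model of that decision; the register state and names are mocked stand-ins, not the real amdgpu code.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model of the freeze/poll/fallback check in
 * sdma_v5_0_reset_queue(); the hardware register state is mocked. */
struct fake_sdma {
	bool frozen;          /* stand-in for SDMA0_FREEZE.FROZEN */
	unsigned int status1; /* stand-in for STATUS1; low 10 bits = idle bits */
};

/* Returns 0 when the engine is safely stopped, -ETIMEDOUT otherwise. */
static int wait_engine_stopped(const struct fake_sdma *s, int usec_timeout)
{
	int j;

	for (j = 0; j < usec_timeout; j++) {
		if (s->frozen)
			break;
		/* udelay(1) in the real code */
	}

	/* If FROZEN never showed up, accept the reset only when every
	 * copy-engine idle bit in STATUS1 is set. */
	if (j == usec_timeout && (s->status1 & 0x3FF) != 0x3FF)
		return -ETIMEDOUT;

	return 0;
}
```

The fallback accepts a missed FROZEN indication only when all ten idle bits are set, which is why the real function bails out with -ETIMEDOUT otherwise.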



[PATCH v2 3/4] drm/amdgpu/sdma5.2: split out per instance resume function

2024-09-26 Thread jiadong.zhu
From: Jiadong Zhu 

Extract the resume sequence from sdma_v5_2_gfx_resume for
starting/restarting an individual instance.

Signed-off-by: Jiadong Zhu 
Acked-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 247 ++---
 1 file changed, 136 insertions(+), 111 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index bc9b240a3488..37534c6f431f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -522,14 +522,17 @@ static void sdma_v5_2_enable(struct amdgpu_device *adev, 
bool enable)
 }
 
 /**
- * sdma_v5_2_gfx_resume - setup and start the async dma engines
+ * sdma_v5_2_gfx_resume_instance - start/restart an individual sdma engine
  *
  * @adev: amdgpu_device pointer
+ * @i: instance
+ * @restore: whether to restore the wptr when restarting
  *
- * Set up the gfx DMA ring buffers and enable them.
- * Returns 0 for success, error for failure.
+ * Set up the gfx DMA ring buffers and enable them. On restart, the wptr
+ * and rptr are restored.
+ * Returns 0 on success, negative error code on failure.
  */
-static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
+
+static int sdma_v5_2_gfx_resume_instance(struct amdgpu_device *adev, int i, 
bool restore)
 {
struct amdgpu_ring *ring;
u32 rb_cntl, ib_cntl;
@@ -539,139 +542,161 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device 
*adev)
u32 temp;
u32 wptr_poll_cntl;
u64 wptr_gpu_addr;
-   int i, r;
 
-   for (i = 0; i < adev->sdma.num_instances; i++) {
-   ring = &adev->sdma.instance[i].ring;
+   ring = &adev->sdma.instance[i].ring;
 
-   if (!amdgpu_sriov_vf(adev))
-   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
+   if (!amdgpu_sriov_vf(adev))
+   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0);
 
-   /* Set ring buffer size in dwords */
-   rb_bufsz = order_base_2(ring->ring_size / 4);
-   rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL));
-   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, 
rb_bufsz);
+   /* Set ring buffer size in dwords */
+   rb_bufsz = order_base_2(ring->ring_size / 4);
+   rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL));
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, rb_bufsz);
 #ifdef __BIG_ENDIAN
-   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, 
RB_SWAP_ENABLE, 1);
-   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL,
-   RPTR_WRITEBACK_SWAP_ENABLE, 1);
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SWAP_ENABLE, 1);
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL,
+   RPTR_WRITEBACK_SWAP_ENABLE, 1);
 #endif
-   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL), rb_cntl);
-
-   /* Initialize the ring buffer's read and write pointers */
+   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL), rb_cntl);
+
+   /* Initialize the ring buffer's read and write pointers */
+   if (restore) {
+   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR), lower_32_bits(ring->wptr << 2));
+   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR_HI), upper_32_bits(ring->wptr << 2));
+   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
+   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));
+   } else {
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR), 0);
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR_HI), 0);
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR), 0);
WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_HI), 0);
+   }
 
-   /* setup the wptr shadow polling */
-   wptr_gpu_addr = ring->wptr_gpu_addr;
-   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
-  lower_32_bits(wptr_gpu_addr));
-   WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
-  upper_32_bits(wptr_gpu_addr));
-   wptr_poll_cntl = RREG32_SOC15_IP(GC, 
sdma_v5_2_get_reg_offset(adev, i,
-
mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
-   wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl,
- 

[PATCH v2 4/4] drm/amdgpu/sdma5.2: implement ring reset callback for sdma5.2

2024-09-26 Thread jiadong.zhu
From: Jiadong Zhu 

Implement sdma queue reset callback via MMIO.

v2: enter/exit safe mode for MMIO queue reset.

Signed-off-by: Jiadong Zhu 
---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 91 ++
 1 file changed, 91 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
index 37534c6f431f..8e8f8be01539 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
@@ -1424,6 +1424,96 @@ static int sdma_v5_2_wait_for_idle(void *handle)
return -ETIMEDOUT;
 }
 
+static int sdma_v5_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
+{
+   struct amdgpu_device *adev = ring->adev;
+   int i, j, r;
+   u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, 
stat1_reg;
+
+   if (amdgpu_sriov_vf(adev))
+   return -EINVAL;
+
+   for (i = 0; i < adev->sdma.num_instances; i++) {
+   if (ring == &adev->sdma.instance[i].ring)
+   break;
+   }
+
+   if (i == adev->sdma.num_instances) {
+   DRM_ERROR("sdma instance not found\n");
+   return -EINVAL;
+   }
+
+   amdgpu_gfx_rlc_enter_safe_mode(adev, 0);
+
+   /* stop queue */
+   ib_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_IB_CNTL));
+   ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), ib_cntl);
+
+   rb_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_CNTL));
+   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl);
+
+   /* engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit to 1 */
+   freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE));
+   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
+
+   for (j = 0; j < adev->usec_timeout; j++) {
+   freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_FREEZE));
+
+   if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1)
+   break;
+   udelay(1);
+   }
+
+
+   if (j == adev->usec_timeout) {
+   stat1_reg = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_STATUS1_REG));
+   if ((stat1_reg & 0x3FF) != 0x3FF) {
+   DRM_ERROR("cannot soft reset as sdma not idle\n");
+   r = -ETIMEDOUT;
+   goto err0;
+   }
+   }
+
+   f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL));
+   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl);
+
+   cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL));
+   cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl);
+
+   /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 
mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */
+   preempt = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_PREEMPT));
+   preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), preempt);
+
+   soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET);
+   soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i;
+
+
+   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
+
+   udelay(50);
+
+   soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i);
+
+   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
+
+   /* unfreeze and unhalt */
+   freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE));
+   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0);
+   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
+
+   r = sdma_v5_2_gfx_resume_instance(adev, i, true);
+
+err0:
+   amdgpu_gfx_rlc_exit_safe_mode(adev, 0);
+   return r;
+}
+
 static int sdma_v5_2_ring_preempt_ib(struct amdgpu_ring *ring)
 {
int i, r = 0;
@@ -1859,6 +1949,7 @@ static const struct amdgpu_ring_funcs 
sdma_v5_2_ring_funcs = {
.emit_reg_write_reg_wait = sdma_v5_2_ring_emit_reg_write_reg_wait,
.init_cond_exec = sdma_v5_2_ring_init_cond_exec,
.preempt_ib = sdma_v5_2_ring_preempt_ib,
+   .reset = sdma_v5_2_reset_queue,
 };
 
 static void sdma_v5_2_set_ring_funcs(struct amdgpu_device *adev)
-- 
2.25.1



Re: [PATCH v5 41/80] drm/tidss: Run DRM default client setup

2024-09-26 Thread Tomi Valkeinen

On 24/09/2024 10:12, Thomas Zimmermann wrote:

Call drm_client_setup() to run the kernel's default client setup
for DRM. Set fbdev_probe in struct drm_driver, so that the client
setup can start the common fbdev client.

The tidss driver specifies a preferred color mode of 32. As this
is the default if no format has been given, leave it out entirely.

v5:
- select DRM_CLIENT_SELECTION

Signed-off-by: Thomas Zimmermann 
Cc: Jyri Sarha 
Cc: Tomi Valkeinen 
Acked-by: Javier Martinez Canillas 
---
  drivers/gpu/drm/tidss/Kconfig | 1 +
  drivers/gpu/drm/tidss/tidss_drv.c | 4 +++-
  2 files changed, 4 insertions(+), 1 deletion(-)


Tested on AM62 SK.

Reviewed-by: Tomi Valkeinen 

 Tomi


diff --git a/drivers/gpu/drm/tidss/Kconfig b/drivers/gpu/drm/tidss/Kconfig
index 2385c56493b9..31ad582b7602 100644
--- a/drivers/gpu/drm/tidss/Kconfig
+++ b/drivers/gpu/drm/tidss/Kconfig
@@ -2,6 +2,7 @@ config DRM_TIDSS
tristate "DRM Support for TI Keystone"
depends on DRM && OF
depends on ARM || ARM64 || COMPILE_TEST
+   select DRM_CLIENT_SELECTION
select DRM_KMS_HELPER
select DRM_DISPLAY_HELPER
select DRM_BRIDGE_CONNECTOR
diff --git a/drivers/gpu/drm/tidss/tidss_drv.c 
b/drivers/gpu/drm/tidss/tidss_drv.c
index d15f836dca95..2428b9aaa003 100644
--- a/drivers/gpu/drm/tidss/tidss_drv.c
+++ b/drivers/gpu/drm/tidss/tidss_drv.c
@@ -11,6 +11,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -109,6 +110,7 @@ static const struct drm_driver tidss_driver = {
.fops   = &tidss_fops,
.release= tidss_release,
DRM_GEM_DMA_DRIVER_OPS_VMAP,
+   DRM_FBDEV_DMA_DRIVER_OPS,
.name   = "tidss",
.desc   = "TI Keystone DSS",
.date   = "20180215",
@@ -186,7 +188,7 @@ static int tidss_probe(struct platform_device *pdev)
goto err_irq_uninstall;
}
  
-	drm_fbdev_dma_setup(ddev, 32);

+   drm_client_setup(ddev, NULL);
  
  	dev_dbg(dev, "%s done\n", __func__);
  




Re: [PATCH v2 2/4] drm/amdgpu/sdma5: implement ring reset callback for sdma5

2024-09-26 Thread Alex Deucher
On Thu, Sep 26, 2024 at 3:58 AM  wrote:
>
> From: Jiadong Zhu 
>
> Implement sdma queue reset callback via MMIO.
>
> v2: enter/exit safemode when sdma queue reset.
>
> Signed-off-by: Jiadong Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 88 ++
>  1 file changed, 88 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c 
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> index e813da1e48aa..416273d917e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> @@ -1555,6 +1555,93 @@ static int sdma_v5_0_soft_reset(void *handle)
> return 0;
>  }
>
> +static int sdma_v5_0_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
> +{
> +   struct amdgpu_device *adev = ring->adev;
> +   int i, j, r;
> +   u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, 
> stat1_reg;
> +
> +   if (amdgpu_sriov_vf(adev))
> +   return -EINVAL;
> +
> +   for (i = 0; i < adev->sdma.num_instances; i++) {
> +   if (ring == &adev->sdma.instance[i].ring)
> +   break;
> +   }
> +
> +   if (i == adev->sdma.num_instances) {
> +   DRM_ERROR("sdma instance not found\n");
> +   return -EINVAL;
> +   }
> +
> +   amdgpu_gfx_rlc_enter_safe_mode(adev, 0);
> +
> +   /* stop queue */
> +   ib_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
> mmSDMA0_GFX_IB_CNTL));
> +   ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), 
> ib_cntl);
> +
> +   rb_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
> mmSDMA0_GFX_RB_CNTL));
> +   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), 
> rb_cntl);
> +
> +   /* engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit 
> to 1 */
> +   freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE));
> +   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
> +
> +   for (j = 0; j < adev->usec_timeout; j++) {
> +   freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
> mmSDMA0_FREEZE));
> +   if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1)
> +   break;
> +   udelay(1);
> +   }
> +
> +   /* check sdma copy engine all idle if frozen not received*/
> +   if (j == adev->usec_timeout) {
> +   stat1_reg = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
> mmSDMA0_STATUS1_REG));
> +   if ((stat1_reg & 0x3FF) != 0x3FF) {
> +   DRM_ERROR("cannot soft reset as sdma not idle\n");
> +   r = -ETIMEDOUT;
> +   goto err0;
> +   }
> +   }
> +
> +   f32_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
> mmSDMA0_F32_CNTL));
> +   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl);
> +
> +   cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL));
> +   cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl);
> +
> +   /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 
> mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */
> +   preempt = RREG32(sdma_v5_0_get_reg_offset(adev, i, 
> mmSDMA0_GFX_PREEMPT));
> +   preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), 
> preempt);
> +
> +   soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET);
> +   soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i;
> +
> +   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
> +
> +   udelay(50);
> +
> +   soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i);
> +   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
> +
> +   /* unfreeze*/
> +   freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE));
> +   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0);
> +   WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
> +
> +   r = sdma_v5_0_gfx_resume_instance(adev, i, true);
> +
> +err0:
> +   amdgpu_gfx_rlc_exit_safe_mode(adev, 0);
> +   return r;
> +}
> +
>  static int sdma_v5_0_ring_preempt_ib(struct amdgpu_ring *ring)
>  {
> int i, r = 0;
> @@ -1897,6 +1984,7 @@ static const struct amdgpu_ring_funcs 
> sdma_v5_0_ring_funcs = {
> .emit_reg_write_reg_wait = sdma_v5_0_ring_emit_reg_write_reg_wait,
> .init_cond_exec = sdma_v5_0_ring_init_cond_exec,
> .preempt_ib = sdma_v5_0_ring_preempt_ib,
> +   .reset = sdma_v5_0_reset_queue,
>  };
>

RE: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed

2024-09-26 Thread Zhang, Yifan
[AMD Official Use Only - AMD Internal Distribution Only]

-Original Message-
From: Alex Deucher 
Sent: Thursday, September 26, 2024 1:17 AM
To: Zhang, Yifan 
Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ; 
Kuehling, Felix 
Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB limitation 
removed

On Tue, Sep 24, 2024 at 10:22 AM Zhang, Yifan  wrote:
>
>
> The 2GB limitation in VRAM allocation was removed in the patch below. My
> patch is a follow-up refinement: the remaining_size calculation existed to
> address the 2GB limitation in contiguous VRAM allocation and is no longer
> needed now that the limitation is removed.
>

Thanks.  Would be good to add a reference to this commit in the commit message 
so it's clear what you are talking about and also provide a bit more of a 
description about why this can be cleaned up (like you did above).  One other 
comment below as well.

Sure, I will send a v2 patch with an updated commit message.

> commit b2dba064c9bdd18c7dd39066d25453af28451dbf
> Author: Philip Yang 
> Date:   Fri Apr 19 16:27:00 2024 -0400
>
> drm/amdgpu: Handle sg size limit for contiguous allocation
>
> Define macro AMDGPU_MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist
> length is unsigned int, and some users of it cast to a signed int, so
> every segment of sg table is limited to size 2GB maximum.
>
> For contiguous VRAM allocation, don't limit the max buddy block size in
> order to get contiguous VRAM memory. To workaround the sg table segment
> size limit, allocate multiple segments if contiguous size is bigger than
> AMDGPU_MAX_SG_SEGMENT_SIZE.
>
> Signed-off-by: Philip Yang 
> Reviewed-by: Christian König 
> Signed-off-by: Alex Deucher 
>
>
>
> -Original Message-
> From: Alex Deucher 
> Sent: Tuesday, September 24, 2024 10:07 PM
> To: Zhang, Yifan 
> Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ;
> Kuehling, Felix 
> Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB
> limitation removed
>
> On Mon, Sep 23, 2024 at 4:28 AM Yifan Zhang  wrote:
> >
> > Make vram alloc loop simpler after 2GB limitation removed.
>
> Can you provide more context?  What 2GB limitation are you referring to?
>
> Alex
>
> >
> > Signed-off-by: Yifan Zhang 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 15 +--
> >  1 file changed, 5 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > index 7d26a962f811..3d129fd61fa7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > @@ -455,7 +455,7 @@ static int amdgpu_vram_mgr_new(struct 
> > ttm_resource_manager *man,
> > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
> > u64 vis_usage = 0, max_bytes, min_block_size;
> > struct amdgpu_vram_mgr_resource *vres;
> > -   u64 size, remaining_size, lpfn, fpfn;
> > +   u64 size, lpfn, fpfn;
> > unsigned int adjust_dcc_size = 0;
> > struct drm_buddy *mm = &mgr->mm;
> > struct drm_buddy_block *block; @@ -516,25 +516,23 @@ static
> > int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
> > adev->gmc.gmc_funcs->get_dcc_alignment)
> > adjust_dcc_size =
> > amdgpu_gmc_get_dcc_alignment(adev);
> >
> > -   remaining_size = (u64)vres->base.size;
> > +   size = (u64)vres->base.size;
> > if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && 
> > adjust_dcc_size) {
> > unsigned int dcc_size;
> >
> > dcc_size = roundup_pow_of_two(vres->base.size + 
> > adjust_dcc_size);
> > -   remaining_size = (u64)dcc_size;
> > +   size = (u64)dcc_size;
> >
> > vres->flags |= DRM_BUDDY_TRIM_DISABLE;
> > }
> >
> > mutex_lock(&mgr->lock);
> > -   while (remaining_size) {
> > +   while (true) {

I think you can also drop this while loop now too.

Alex

There is still a "continue" path left in the loop; I think the logic would be
broken here if the loop were dropped.

   r = drm_buddy_alloc_blocks(mm, fpfn,
lpfn,
size,
min_block_size,
&vres->blocks,
vres->flags);

 if (unlikely(r == -ENOSPC) && pages_per_block == ~0ul &&
 !(place->flags & TTM_PL_FLAG_CONTIGUOUS)) {
 vres->flags &= ~DRM_BUDDY_CONTIGUOUS_ALLOCATION;
 pages_per_block = max_t(u32, 2UL << (20UL - PAGE_SHIFT),
 tbo->page_alignment);

 continue; /* continue in the loop */
 }


> > if (tbo->page_alignment)
> > min_block_size = (u64)tbo->page_alignment << 
> > PAGE_SHIFT;
> > else
> > min_block_size = mgr->d
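The retry path being discussed can be modelled outside the kernel. The sketch below is a userspace stand-in for the amdgpu_vram_mgr_new() flow — the allocator, flag, and block sizes are mocked, not the real amdgpu API — and it shows why the loop still matters after remaining_size is gone: the -ENOSPC fallback re-enters the allocation with the contiguous flag cleared.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model of the amdgpu_vram_mgr_new() retry flow under discussion.
 * The allocator, the flag, and the sizes are stand-ins for the real API. */
#define FAKE_CONTIGUOUS_ALLOCATION 0x1UL

/* Mock allocator: fails with -ENOSPC while the contiguous flag is set. */
static int fake_buddy_alloc(unsigned long flags)
{
	return (flags & FAKE_CONTIGUOUS_ALLOCATION) ? -ENOSPC : 0;
}

/* The loop cannot simply be dropped: on -ENOSPC it retries with the
 * contiguous requirement relaxed and a smaller pages_per_block. */
static int alloc_with_fallback(unsigned long *flags,
			       unsigned long *pages_per_block,
			       bool contiguous_placement, int *attempts)
{
	int r;

	while (true) {
		(*attempts)++;
		r = fake_buddy_alloc(*flags);
		if (r == -ENOSPC && *pages_per_block == ~0ul &&
		    !contiguous_placement) {
			*flags &= ~FAKE_CONTIGUOUS_ALLOCATION;
			*pages_per_block = 512;
			continue;	/* retry without the contiguous flag */
		}
		return r;
	}
}
```

Dropping the loop would mean turning the continue path into an explicit second allocation call, which is the trade-off the thread is weighing.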

Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters

2024-09-26 Thread Lazar, Lijo



On 9/26/2024 5:56 PM, Pierre-Eric Pelloux-Prayer wrote:
> 
> 
> Le 26/09/2024 à 11:27, Lazar, Lijo a écrit :
>>
>>
>> On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:
>>> Don't wake up the GPU for reading pm values. Instead, take a runtime
>>> powermanagement ref when trying to read it iff the GPU is already
>>> awake.
>>>
>>> This avoids spurious wake ups (eg: from applets).
>>>
>>> We use pm_runtime_get_if_active(ign_usage_count=true) because we care
>>> about "is the GPU awake?" not about "is the GPU awake and something else
>>> prevents suspend?".
>>
>> One possible downside of this approach is - let's say an application
>> tries this way on a BACO enabled device (device is visible on bus, but
>> powered off due to runtime PM)
>>
>> Get current clock level
>> If (success) && (not desired clock level)
>>     Set clock level
>>     Submit some jobs to run at set clock level
>>
> >> This type of approach won't work since the "get current clock level"
> >> step itself will fail. That said, I don't know if there is any app out
> >> there that does something like this.
> 
> Yes, this would break this pattern but I don't see why it would be used.
> 
> If this theoretical concern becomes real because an application reports
> a breakage after this series is merged, we can figure out how to best
> deal with it.
> 

Right, will see if Alex/Kenneth has any comments.

That apart, the series looks good to me.

Thanks,
Lijo

> Pierre-Eric
> 
> 
>>
>> Thanks,
>> Lijo
>>
>>>
>>> Tested-by: Mario Limonciello 
>>> Signed-off-by: Pierre-Eric Pelloux-Prayer
>>> 
>>> ---
>>>   drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++---
>>>   1 file changed, 69 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> index c8f34b1a4d8e..f1f339b75380 100644
>>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
>>> @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct
>>> device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     amdgpu_dpm_get_current_power_state(adev, &pm);
>>>   @@ -268,9 +268,9 @@ static ssize_t
>>> amdgpu_get_power_dpm_force_performance_level(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     level = amdgpu_dpm_get_performance_level(adev);
>>>   @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct
>>> device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     if (amdgpu_dpm_get_pp_num_states(adev, &data))
>>>   memset(&data, 0, sizeof(data));
>>> @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct
>>> device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     amdgpu_dpm_get_current_power_state(adev, &pm);
>>>   @@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct
>>> device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     size = amdgpu_dpm_get_pp_table(adev, &table);
>>>   @@ -840,9 +840,9 @@ static ssize_t
>>> amdgpu_get_pp_od_clk_voltage(struct device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(ddev->dev);
>>> -    if (ret < 0)
>>> -    return ret;
>>> +    ret = pm_runtime_get_if_active(ddev->dev, true);
>>> +    if (ret <= 0)
>>> +    return ret ?: -EPERM;
>>>     for (clk_index = 0 ; clk_index < 6 ; clk_index++) {
>>>   ret = amdgpu_dpm_emit_clock_levels(adev,
>>> od_clocks[clk_index], buf, &size);
>>> @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct
>>> device *dev,
>>>   if (adev->in_suspend && !adev->in_runpm)
>>>   return -EPERM;
>>>   -    ret = pm_runtime_resume_and_get(dde
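The conditional-get semantics in the hunks above boil down to the `ret ?: -EPERM` idiom: a positive return means "GPU awake, reference taken", zero means "asleep, no reference", negative means an error. A userspace sketch with a stub in place of pm_runtime_get_if_active() (the names here are illustrative, not the kernel API):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Stub standing in for pm_runtime_get_if_active(): returns 1 and takes a
 * reference if the device is already awake, 0 if it is suspended, and a
 * negative errno on error.
 */
static int stub_get_if_active(bool awake)
{
	return awake ? 1 : 0;
}

/* Map "device asleep" (0) to -EPERM while passing real errors through. */
static int read_pm_value(bool awake, int *value)
{
	int ret = stub_get_if_active(awake);

	if (ret <= 0)
		return ret ? ret : -EPERM;	/* kernel code spells this "ret ?: -EPERM" */

	*value = 42;	/* placeholder for the actual hardware read */
	return 0;
}
```

The key behavioural change versus pm_runtime_resume_and_get() is that a suspended GPU makes the read fail with -EPERM instead of being woken up.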

Re: [PATCH] drm/amdgpu: fix vbios fetching for SR-IOV

2024-09-26 Thread Alex Deucher
On Thu, Sep 26, 2024 at 7:18 AM Lazar, Lijo  wrote:
>
>
>
> On 9/25/2024 11:50 PM, Alex Deucher wrote:
> > SR-IOV fetches the vbios from VRAM in some cases.
> > Re-enable the VRAM path for dGPUs and rename the function
> > to make it clear that it is not IGP specific.
> >
> > Fixes: 042658d17a54 ("drm/amdgpu: clean up vbios fetching code")
> > Signed-off-by: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 11 +--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> > index 46bf623919d7c..45affc02548c1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
> > @@ -87,8 +87,9 @@ static bool check_atom_bios(uint8_t *bios, size_t size)
> >   * part of the system bios.  On boot, the system bios puts a
> >   * copy of the igp rom at the start of vram if a discrete card is
> >   * present.
> > + * For SR-IOV, the vbios image is also put in VRAM in the VF.
> >   */
> > -static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
> > +static bool amdgpu_read_bios_from_vram(struct amdgpu_device *adev)
> >  {
> >   uint8_t __iomem *bios;
> >   resource_size_t vram_base;
> > @@ -414,7 +415,7 @@ static bool amdgpu_get_bios_apu(struct amdgpu_device 
> > *adev)
> >   goto success;
> >   }
> >
> > - if (igp_read_bios_from_vram(adev)) {
> > + if (amdgpu_read_bios_from_vram(adev)) {
> >   dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n");
> >   goto success;
> >   }
> > @@ -448,6 +449,12 @@ static bool amdgpu_get_bios_dgpu(struct amdgpu_device 
> > *adev)
> >   goto success;
> >   }
> >
> > + /* this is required for SR-IOV */
> > + if (amdgpu_read_bios_from_vram(adev)) {
>
> If this is only expected for VFs, then it's better to add
> amdgpu_sriov_vf(adev) check for this call.

I think that is the case, but I'm not 100% sure.  I can ask the vbios team.

Alex


>
> Thanks,
> Lijo
>
> > + dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n");
> > + goto success;
> > + }
> > +
> >   if (amdgpu_read_platform_bios(adev)) {
> >   dev_info(adev->dev, "Fetched VBIOS from platform\n");
> >   goto success;


Re: [PATCH v2 4/4] drm/amdgpu/sdma5.2: implement ring reset callback for sdma5.2

2024-09-26 Thread Alex Deucher
On Thu, Sep 26, 2024 at 3:48 AM  wrote:
>
> From: Jiadong Zhu 
>
> Implement sdma queue reset callback via MMIO.
>
> v2: enter/exit safemode for mmio queue reset.
>
> Signed-off-by: Jiadong Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 91 ++
>  1 file changed, 91 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c 
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> index 37534c6f431f..8e8f8be01539 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
> @@ -1424,6 +1424,96 @@ static int sdma_v5_2_wait_for_idle(void *handle)
> return -ETIMEDOUT;
>  }
>
> +static int sdma_v5_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid)
> +{
> +   struct amdgpu_device *adev = ring->adev;
> +   int i, j, r;
> +   u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, 
> stat1_reg;
> +
> +   if (amdgpu_sriov_vf(adev))
> +   return -EINVAL;
> +
> +   for (i = 0; i < adev->sdma.num_instances; i++) {
> +   if (ring == &adev->sdma.instance[i].ring)
> +   break;
> +   }
> +
> +   if (i == adev->sdma.num_instances) {
> +   DRM_ERROR("sdma instance not found\n");
> +   return -EINVAL;
> +   }
> +
> +   amdgpu_gfx_rlc_enter_safe_mode(adev, 0);
> +
> +   /* stop queue */
> +   ib_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
> mmSDMA0_GFX_IB_CNTL));
> +   ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), 
> ib_cntl);
> +
> +   rb_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
> mmSDMA0_GFX_RB_CNTL));
> +   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), 
> rb_cntl);
> +
> +   /*engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit to 
> 1 */
> +   freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE));
> +   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
> +
> +   for (j = 0; j < adev->usec_timeout; j++) {
> +   freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
> mmSDMA0_FREEZE));
> +
> +   if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1)
> +   break;
> +   udelay(1);
> +   }
> +
> +
> +   if (j == adev->usec_timeout) {
> +   stat1_reg = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
> mmSDMA0_STATUS1_REG));
> +   if ((stat1_reg & 0x3FF) != 0x3FF) {
> +   DRM_ERROR("cannot soft reset as sdma not idle\n");
> +   r = -ETIMEDOUT;
> +   goto err0;
> +   }
> +   }
> +
> +   f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
> mmSDMA0_F32_CNTL));
> +   f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl);
> +
> +   cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL));
> +   cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl);
> +
> +   /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 
> mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */
> +   preempt = RREG32(sdma_v5_2_get_reg_offset(adev, i, 
> mmSDMA0_GFX_PREEMPT));
> +   preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), 
> preempt);
> +
> +   soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET);
> +   soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i;
> +
> +
> +   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
> +
> +   udelay(50);
> +
> +   soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i);
> +
> +   WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset);
> +
> +   /* unfreeze and unhalt */
> +   freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE));
> +   freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0);
> +   WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze);
> +
> +   r = sdma_v5_2_gfx_resume_instance(adev, i, true);
> +
> +err0:
> +   amdgpu_gfx_rlc_exit_safe_mode(adev, 0);
> +   return r;
> +}
> +
>  static int sdma_v5_2_ring_preempt_ib(struct amdgpu_ring *ring)
>  {
> int i, r = 0;
> @@ -1859,6 +1949,7 @@ static const struct amdgpu_ring_funcs 
> sdma_v5_2_ring_funcs = {
> .emit_reg_write_reg_wait = sdma_v5_2_ring_emit_reg_write_reg_wait,
> .init_cond_exec = sdma_v5_2_ring_init_cond_exec,
> .preempt_ib = sdma_v5_2_ring_preempt_ib,
> +   .reset = sdma_v5_2_reset_queue,
>  };
>
>  static void sdma_v5_2_set_ri
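The FROZEN-bit wait in the reset callback above follows the usual poll-with-timeout shape. A minimal plain-C sketch, where a stub register read stands in for RREG32()/REG_GET_FIELD() and the timeout constant is made up rather than taken from adev->usec_timeout:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define USEC_TIMEOUT 100	/* placeholder for adev->usec_timeout */

/* Stub register read: reports "frozen" after a few polls. */
static unsigned int polls;
static bool stub_read_frozen(void)
{
	return ++polls >= 3;
}

/*
 * Poll up to the timeout; the caller then decides whether timing out is
 * fatal (in the patch, a further SDMA0_STATUS1_REG idle check follows).
 */
static int wait_for_frozen(void)
{
	unsigned int j;

	for (j = 0; j < USEC_TIMEOUT; j++) {
		if (stub_read_frozen())
			return 0;
		/* udelay(1) would go here in real kernel code */
	}
	return -ETIMEDOUT;
}
```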

RE: [PATCH v3] drm/amd/display: Fix out-of-bounds access in 'dcn21_link_encoder_create'

2024-09-26 Thread Li, Roman
[Public]

Reviewed-by: Roman Li 

> -Original Message-
> From: SHANMUGAM, SRINIVASAN 
> Sent: Wednesday, September 25, 2024 11:10 AM
> To: Siqueira, Rodrigo ; Pillai, Aurabindo
> 
> Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN
> ; Chung, ChiaHsuan (Tom)
> ; Li, Roman ; Hung, Alex
> ; Wentland, Harry ; Mahfooz,
> Hamza 
> Subject: [PATCH v3] drm/amd/display: Fix out-of-bounds access in
> 'dcn21_link_encoder_create'
>
> An issue was identified in the dcn21_link_encoder_create function where an
> out-of-bounds access could occur when the hpd_source index was used to
> reference the link_enc_hpd_regs array. This array has a fixed size and the
> index was not being checked against the array's bounds before accessing it.
>
> This fix adds a conditional check to ensure that the hpd_source index is
> within the valid range of the link_enc_hpd_regs array. If the index is out
> of bounds, the function now returns NULL to prevent undefined behavior.
>
> References:
>
> [   65.920507] [ cut here ]
> [   65.920510] UBSAN: array-index-out-of-bounds in
> drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn21/dcn21_resource.c:1312:
> 29
> [   65.920519] index 7 is out of range for type 'dcn10_link_enc_hpd_registers [5]'
> [   65.920523] CPU: 3 PID: 1178 Comm: modprobe Tainted: G   OE  6.8.0-cleanershaderfeatureresetasdntipmi200nv2132 #13
> [   65.920525] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS
> WMJ0429N_Weekly_20_04_2 04/29/2020
> [   65.920527] Call Trace:
> [   65.920529]  
> [   65.920532]  dump_stack_lvl+0x48/0x70
> [   65.920541]  dump_stack+0x10/0x20
> [   65.920543]  __ubsan_handle_out_of_bounds+0xa2/0xe0
> [   65.920549]  dcn21_link_encoder_create+0xd9/0x140 [amdgpu]
> [   65.921009]  link_create+0x6d3/0xed0 [amdgpu]
> [   65.921355]  create_links+0x18a/0x4e0 [amdgpu]
> [   65.921679]  dc_create+0x360/0x720 [amdgpu]
> [   65.921999]  ? dmi_matches+0xa0/0x220
> [   65.922004]  amdgpu_dm_init+0x2b6/0x2c90 [amdgpu]
> [   65.922342]  ? console_unlock+0x77/0x120
> [   65.922348]  ? dev_printk_emit+0x86/0xb0
> [   65.922354]  dm_hw_init+0x15/0x40 [amdgpu]
> [   65.922686]  amdgpu_device_init+0x26a8/0x33a0 [amdgpu]
> [   65.922921]  amdgpu_driver_load_kms+0x1b/0xa0 [amdgpu]
> [   65.923087]  amdgpu_pci_probe+0x1b7/0x630 [amdgpu]
> [   65.923087]  local_pci_probe+0x4b/0xb0
> [   65.923087]  pci_device_probe+0xc8/0x280
> [   65.923087]  really_probe+0x187/0x300
> [   65.923087]  __driver_probe_device+0x85/0x130
> [   65.923087]  driver_probe_device+0x24/0x110
> [   65.923087]  __driver_attach+0xac/0x1d0
> [   65.923087]  ? __pfx___driver_attach+0x10/0x10
> [   65.923087]  bus_for_each_dev+0x7d/0xd0
> [   65.923087]  driver_attach+0x1e/0x30
> [   65.923087]  bus_add_driver+0xf2/0x200
> [   65.923087]  driver_register+0x64/0x130
> [   65.923087]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
> [   65.923087]  __pci_register_driver+0x61/0x70
> [   65.923087]  amdgpu_init+0x7d/0xff0 [amdgpu]
> [   65.923087]  do_one_initcall+0x49/0x310
> [   65.923087]  ? kmalloc_trace+0x136/0x360
> [   65.923087]  do_init_module+0x6a/0x270
> [   65.923087]  load_module+0x1fce/0x23a0
> [   65.923087]  init_module_from_file+0x9c/0xe0
> [   65.923087]  ? init_module_from_file+0x9c/0xe0
> [   65.923087]  idempotent_init_module+0x179/0x230
> [   65.923087]  __x64_sys_finit_module+0x5d/0xa0
> [   65.923087]  do_syscall_64+0x76/0x120
> [   65.923087]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [   65.923087] RIP: 0033:0x7f2d80f1e88d
> [   65.923087] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 
> f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 
> ff ff
> 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
> [   65.923087] RSP: 002b:7ffc7bc1aa78 EFLAGS: 0246 ORIG_RAX:
> 0139
> [   65.923087] RAX: ffda RBX: 564c9c1db130 RCX: 
> 7f2d80f1e88d
> [   65.923087] RDX:  RSI: 564c9c1e5480 RDI:
> 000f
> [   65.923087] RBP: 0004 R08:  R09:
> 0002
> [   65.923087] R10: 000f R11: 0246 R12:
> 564c9c1e5480
> [   65.923087] R13: 564c9c1db260 R14:  R15:
> 564c9c1e54b0
> [   65.923087]  
> [   65.923927] ---[ end trace ]---
>
> Cc: Tom Chung 
> Cc: Rodrigo Siqueira 
> Cc: Roman Li 
> Cc: Alex Hung 
> Cc: Aurabindo Pillai 
> Cc: Harry Wentland 
> Cc: Hamza Mahfooz 
> Signed-off-by: Srinivasan Shanmugam 
> ---
> v2:
>  - Changed to ARRAY_SIZE() to be generic (Roman)
>
> v3:
>  - Updated git commit message
>
>  drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
> b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
> index 347e6aaea582..14b28841657d 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resourc
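The shape of the fix — rejecting an hpd_source beyond ARRAY_SIZE() of the register table and returning NULL — can be sketched in plain C. The register-set type and table contents below are placeholders, not the actual dcn21 definitions:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder register-set type and table, mimicking link_enc_hpd_regs[5]. */
struct hpd_regs { int reg; };
static const struct hpd_regs hpd_regs[5] = { {0}, {1}, {2}, {3}, {4} };

/*
 * Bounds-checked lookup: return NULL instead of indexing past the end.
 * Using ARRAY_SIZE() (as in v2 of the patch) keeps the check correct even
 * if the table size changes later.
 */
static const struct hpd_regs *lookup_hpd_regs(unsigned int hpd_source)
{
	if (hpd_source >= ARRAY_SIZE(hpd_regs))
		return NULL;
	return &hpd_regs[hpd_source];
}
```

With this guard, the index 7 reported by UBSAN above would make the create function bail out instead of reading past the array.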

Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM

2024-09-26 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Sep 26, 2024 at 10:02 AM Alex Deucher  wrote:
>
> On Thu, Sep 26, 2024 at 10:00 AM Marek Olšák  wrote:
> >
> > Is GTT cleared too?
>
> GTT has always been cleared since it's system memory.
>
> Alex
>
>
> >
> > Marek
> >
> > On Thu, Sep 26, 2024, 09:53 Alex Deucher  wrote:
> >>
> >> Ping?
> >>
> >> On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher  
> >> wrote:
> >> >
> >> > Driver now clears VRAM on allocation.  Bump the
> >> > driver version so mesa knows when it will get
> >> > cleared vram by default.
> >> >
> >> > Signed-off-by: Alex Deucher 
> >> > ---
> >> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
> >> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> >> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> > index d90473aec942..fad556be840b 100644
> >> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >> > @@ -117,9 +117,10 @@
> >> >   * - 3.56.0 - Update IB start address and size alignment for decode and 
> >> > encode
> >> >   * - 3.57.0 - Compute tunneling on GFX10+
> >> >   * - 3.58.0 - Add GFX12 DCC support
> >> > + * - 3.59.0 - Cleared VRAM
> >> >   */
> >> >  #define KMS_DRIVER_MAJOR   3
> >> > -#define KMS_DRIVER_MINOR   58
> >> > +#define KMS_DRIVER_MINOR   59
> >> >  #define KMS_DRIVER_PATCHLEVEL  0
> >> >
> >> >  /*
> >> > --
> >> > 2.46.0
> >> >


Re: [PATCH v2 03/08] drm/amdgpu: Add wait IOCTL timeline syncobj support

2024-09-26 Thread Christian König

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

Add user fence wait IOCTL timeline syncobj support.

v2:(Christian)
   - handle dma_fence_wait() return value.
   - shorten the variable name syncobj_timeline_points a bit.
   - move num_points up to avoid padding issues.

Signed-off-by: Arunpravin Paneer Selvam 
---
  .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c   | 82 +--
  include/uapi/drm/amdgpu_drm.h | 16 +++-
  2 files changed, 88 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index 97b1af574407..2465ca307644 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -474,11 +474,11 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, 
void *data,
  int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp)
  {
+   u32 *syncobj_handles, *timeline_points, *timeline_handles, *bo_handles;
+   u32 num_syncobj, num_bo_handles, num_points;
struct drm_amdgpu_userq_fence_info *fence_info = NULL;
struct drm_amdgpu_userq_wait *wait_info = data;
-   u32 *syncobj_handles, *bo_handles;
struct dma_fence **fences = NULL;
-   u32 num_syncobj, num_bo_handles;
struct drm_gem_object **gobj;
struct drm_exec exec;
int r, i, entry, cnt;
@@ -498,11 +498,26 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void 
*data,
goto free_bo_handles;
}
  
+	num_points = wait_info->num_points;

+   timeline_handles = 
memdup_user(u64_to_user_ptr(wait_info->syncobj_timeline_handles),
+  sizeof(u32) * num_points);
+   if (IS_ERR(timeline_handles)) {
+   r = PTR_ERR(timeline_handles);
+   goto free_syncobj_handles;
+   }
+
+   timeline_points = 
memdup_user(u64_to_user_ptr(wait_info->syncobj_timeline_points),
+ sizeof(u32) * num_points);
+   if (IS_ERR(timeline_points)) {
+   r = PTR_ERR(timeline_points);
+   goto free_timeline_handles;
+   }
+
/* Array of GEM object handles */
gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL);
if (!gobj) {
r = -ENOMEM;
-   goto free_syncobj_handles;
+   goto free_timeline_points;
}
  
  	for (entry = 0; entry < num_bo_handles; entry++) {

@@ -524,17 +539,34 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void 
*data,
}
  
  	if (!wait_info->num_fences) {

+   if (num_points) {
+   struct dma_fence *fence;
+
+   for (i = 0; i < num_points; i++) {
+   r = drm_syncobj_find_fence(filp, 
timeline_handles[i],
+  timeline_points[i],
+  
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+  &fence);
+   if (r || !fence)


You can't simply ignore errors here. That needs some goto error handling 
and cleanup.



+   continue;
+
+   dma_fence_put(fence);
+   num_fences++;


There can be more than one fence in the timeline we need to wait for. 
You should probably use dma_fence_unwrap_for_each() here.



+   }
+   }
+
/* Count syncobj's fence */
for (i = 0; i < num_syncobj; i++) {
struct dma_fence *fence;
  
  			r = drm_syncobj_find_fence(filp, syncobj_handles[i],

-  0, 0, &fence);
-   dma_fence_put(fence);
-
+  0,
+  
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+  &fence);
if (r || !fence)
continue;
  
+			dma_fence_put(fence);

num_fences++;
}
  
@@ -589,12 +621,34 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data,

}
}
  
+		if (num_points) {

+   struct dma_fence *fence;
+
+   for (i = 0; i < num_points; i++) {
+   r = drm_syncobj_find_fence(filp, 
timeline_handles[i],
+  timeline_points[i],
+  
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+  &fence);
+   if (r || !fence)


Same here, 
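The goto-based unwind Christian is asking for is the standard kernel idiom: each acquired resource gets a matching error label, released in reverse order on failure, and lookup errors are propagated instead of silently skipped. A plain-C sketch, where lookup_fence() is only a stand-in for drm_syncobj_find_fence(), not the real API:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-in for drm_syncobj_find_fence(): fails for invalid handles. */
static int lookup_fence(int handle, int **fence)
{
	if (handle < 0)
		return -EINVAL;
	*fence = malloc(sizeof(**fence));
	return *fence ? 0 : -ENOMEM;
}

static int count_fences(const int *handles, int n, int *num_fences)
{
	int i, r;

	*num_fences = 0;
	for (i = 0; i < n; i++) {
		int *fence;

		r = lookup_fence(handles[i], &fence);
		if (r)
			goto err;	/* propagate instead of "continue" on error */

		free(fence);	/* stand-in for dma_fence_put() */
		(*num_fences)++;
	}
	return 0;

err:
	/* nothing further to unwind in this sketch; real code frees the
	 * handle/point arrays here before returning */
	return r;
}
```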

Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters

2024-09-26 Thread Pierre-Eric Pelloux-Prayer




Le 26/09/2024 à 11:27, Lazar, Lijo a écrit :



On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote:

Don't wake up the GPU for reading pm values. Instead, take a runtime
power-management ref when trying to read it iff the GPU is already
awake.

This avoids spurious wake ups (eg: from applets).

We use pm_runtime_get_if_active(ign_usage_count=true) because we care
about "is the GPU awake?" not about "is the GPU awake and something else
prevents suspend?".


One possible downside of this approach: say an application tries the
following on a BACO-enabled device (the device is visible on the bus, but
powered off due to runtime PM):

Get current clock level
If (success) && (not desired clock level)
Set clock level
Submit some jobs to run at set clock level

This type of approach won't work since the "get current clock level" step
itself will fail. That said, I don't know if there is any app out there that
does something like this.


Yes, this would break this pattern but I don't see why it would be used.

If this theoretical concern becomes real because an application reports a breakage after this series 
is merged, we can figure out how to best deal with it.


Pierre-Eric




Thanks,
Lijo



Tested-by: Mario Limonciello 
Signed-off-by: Pierre-Eric Pelloux-Prayer 
---
  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++---
  1 file changed, 69 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index c8f34b1a4d8e..f1f339b75380 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct device 
*dev,
if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	amdgpu_dpm_get_current_power_state(adev, &pm);
  
@@ -268,9 +268,9 @@ static ssize_t amdgpu_get_power_dpm_force_performance_level(struct device *dev,

if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	level = amdgpu_dpm_get_performance_level(adev);
  
@@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct device *dev,

if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	if (amdgpu_dpm_get_pp_num_states(adev, &data))

memset(&data, 0, sizeof(data));
@@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev,
if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	amdgpu_dpm_get_current_power_state(adev, &pm);
  
@@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct device *dev,

if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	size = amdgpu_dpm_get_pp_table(adev, &table);
  
@@ -840,9 +840,9 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,

if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	for (clk_index = 0 ; clk_index < 6 ; clk_index++) {

ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], buf, 
&size);
@@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct device *dev,
if (adev->in_suspend && !adev->in_runpm)
return -EPERM;
  
-	ret = pm_runtime_resume_and_get(ddev->dev);

-   if (ret < 0)
-   return ret;
+   ret = pm_runtime_get_if_active(ddev->dev, true);
+   if (ret <= 0)
+   return ret ?: -EPERM;
  
  	size = amdgpu_dpm_get_ppfeature_status(adev, buf);

if (size <= 0)
@@ -996,9 +996,9 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
if (adev->in_suspend && !adev->in_runpm)

Re: [PATCH v2 06/08] drm/amdgpu: Add few optimizations to userq fence driver

2024-09-26 Thread Christian König

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

Add few optimizations to userq fence driver.


"Few optimizations and fixes for the userq fence driver".



v1:(Christian):
   - Remove unnecessary comments.
   - In the drm_exec_init() call, pass num_bo_handles as the last parameter;
     it would make allocation of the array more efficient.
   - Handle return value of __xa_store() and improve the error handling of
 amdgpu_userq_fence_driver_alloc().

v2:(Christian):
- Revert userq_xa xarray init to XA_FLAGS_LOCK_IRQ.
- Move the xa_unlock before the error check of the xa_err(__xa_store()) call,
  and move this change to a separate patch, as it adds missing error handling.
- Removed the unnecessary comments.

Signed-off-by: Arunpravin Paneer Selvam 


Reviewed-by: Christian König 


---
  .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c   | 44 ---
  .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h   |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |  6 +--
  .../gpu/drm/amd/include/amdgpu_userqueue.h|  2 +-
  4 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index 8d2a0331cd96..f3576c31428c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -76,7 +76,8 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device 
*adev,
fence_drv = kzalloc(sizeof(*fence_drv), GFP_KERNEL);
if (!fence_drv) {
DRM_ERROR("Failed to allocate memory for fence driver\n");
-   return -ENOMEM;
+   r = -ENOMEM;
+   goto free_fence_drv;
}
  
  	/* Acquire seq64 memory */

@@ -84,7 +85,8 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device 
*adev,
   &fence_drv->cpu_addr);
if (r) {
kfree(fence_drv);
-   return -ENOMEM;
+   r = -ENOMEM;
+   goto free_seq64;
}
  
  	memset(fence_drv->cpu_addr, 0, sizeof(u64));

@@ -94,7 +96,7 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device 
*adev,
spin_lock_init(&fence_drv->fence_list_lock);
  
  	fence_drv->adev = adev;

-   fence_drv->uq_fence_drv_xa_ref = &userq->uq_fence_drv_xa;
+   fence_drv->fence_drv_xa_ptr = &userq->fence_drv_xa;
fence_drv->context = dma_fence_context_alloc(1);
get_task_comm(fence_drv->timeline_name, current);
  
@@ -106,6 +108,13 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev,

userq->fence_drv = fence_drv;
  
  	return 0;

+
+free_seq64:
+   amdgpu_seq64_free(adev, fence_drv->gpu_addr);
+free_fence_drv:
+   kfree(fence_drv);
+
+   return r;
  }
  
  void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv)

@@ -147,7 +156,7 @@ void amdgpu_userq_fence_driver_destroy(struct kref *ref)
struct amdgpu_device *adev = fence_drv->adev;
struct amdgpu_userq_fence *fence, *tmp;
struct xarray *xa = &adev->userq_xa;
-   unsigned long index;
+   unsigned long index, flags;
struct dma_fence *f;
  
  	spin_lock(&fence_drv->fence_list_lock);

@@ -164,11 +173,11 @@ void amdgpu_userq_fence_driver_destroy(struct kref *ref)
}
spin_unlock(&fence_drv->fence_list_lock);
  
-	xa_lock(xa);

+   xa_lock_irqsave(xa, flags);
xa_for_each(xa, index, xa_fence_drv)
if (xa_fence_drv == fence_drv)
__xa_erase(xa, index);
-   xa_unlock(xa);
+   xa_unlock_irqrestore(xa, flags);
  
  	/* Free seq64 memory */

amdgpu_seq64_free(adev, fence_drv->gpu_addr);
@@ -212,12 +221,12 @@ int amdgpu_userq_fence_create(struct 
amdgpu_usermode_queue *userq,
amdgpu_userq_fence_driver_get(fence_drv);
dma_fence_get(fence);
  
-	if (!xa_empty(&userq->uq_fence_drv_xa)) {

+   if (!xa_empty(&userq->fence_drv_xa)) {
struct amdgpu_userq_fence_driver *stored_fence_drv;
unsigned long index, count = 0;
int i = 0;
  
-		xa_for_each(&userq->uq_fence_drv_xa, index, stored_fence_drv)

+   xa_for_each(&userq->fence_drv_xa, index, stored_fence_drv)
count++;
  
  		userq_fence->fence_drv_array =

@@ -226,9 +235,9 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue 
*userq,
   GFP_KERNEL);
  
  		if (userq_fence->fence_drv_array) {

-   xa_for_each(&userq->uq_fence_drv_xa, index, 
stored_fence_drv) {
+   xa_for_each(&userq->fence_drv_xa, index, 
stored_fence_drv) {
userq_fence->fence_drv_array[i] = 
stored_fence_drv;
-   xa_erase(&userq->uq_fence_drv_xa, index);
+   xa_erase(&userq->fence_drv_xa, index);
i++;
  

Re: [PATCH v2 07/08] drm/amdgpu: Add the missing error handling for xa_store() call

2024-09-26 Thread Christian König

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

Add the missing error handling for xa_store() call in the function
amdgpu_userq_fence_driver_alloc().

Signed-off-by: Arunpravin Paneer Selvam 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index f3576c31428c..43429661f62d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -101,9 +101,11 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device 
*adev,
get_task_comm(fence_drv->timeline_name, current);
  
  	xa_lock_irqsave(&adev->userq_xa, flags);

-   __xa_store(&adev->userq_xa, userq->doorbell_index,
-  fence_drv, GFP_KERNEL);
+   r = xa_err(__xa_store(&adev->userq_xa, userq->doorbell_index,
+ fence_drv, GFP_KERNEL));
xa_unlock_irqrestore(&adev->userq_xa, flags);
+   if (r)
+   goto free_seq64;
  
  	userq->fence_drv = fence_drv;
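For context, xa_err() works because XArray store routines report failure through an error-encoded pointer, and xa_err() extracts the errno (or 0 on success) so the caller can propagate it — which is exactly what the hunk above stops ignoring. A minimal userspace sketch of that ERR_PTR-style encoding (simplified; not the actual kernel implementation):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* same convention as the kernel: top 4095 addresses */

/* Encode a negative errno as a pointer value. */
static void *ERR_PTR(long err)
{
	return (void *)err;
}

/* Extract the errno from an error-encoded pointer, or 0 for a real pointer. */
static int err_code(const void *ptr)
{
	uintptr_t v = (uintptr_t)ptr;

	if (v >= (uintptr_t)-MAX_ERRNO)	/* falls in the reserved error range */
		return (int)(intptr_t)v;
	return 0;
}
```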
  




Re: [PATCH v2 08/08] drm/amdgpu: add vm root BO lock before accessing the vm

2024-09-26 Thread Christian König

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

Add a vm root BO lock before accessing the userqueue VM.

Signed-off-by: Arunpravin Paneer Selvam 
---
  .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c   | 19 +++
  1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index 43429661f62d..52722b738316 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -320,7 +320,6 @@ static const struct dma_fence_ops amdgpu_userq_fence_ops = {
  /**
   * amdgpu_userq_fence_read_wptr - Read the userq wptr value
   *
- * @filp: drm file private data structure
   * @queue: user mode queue structure pointer
   * @wptr: write pointer value
   *
@@ -330,23 +329,27 @@ static const struct dma_fence_ops amdgpu_userq_fence_ops 
= {
   *
   * Returns wptr value on success, error on failure.
   */
-static int amdgpu_userq_fence_read_wptr(struct drm_file *filp,
-   struct amdgpu_usermode_queue *queue,
+static int amdgpu_userq_fence_read_wptr(struct amdgpu_usermode_queue *queue,
u64 *wptr)
  {
-   struct amdgpu_fpriv *fpriv = filp->driver_priv;
struct amdgpu_bo_va_mapping *mapping;
-   struct amdgpu_vm *vm = &fpriv->vm;
struct amdgpu_bo *bo;
u64 addr, *ptr;
int r;
  
+	r = amdgpu_bo_reserve(queue->vm->root.bo, false);

+   if (r)
+   return r;
+
addr = queue->userq_prop->wptr_gpu_addr;
addr &= AMDGPU_GMC_HOLE_MASK;
  
-	mapping = amdgpu_vm_bo_lookup_mapping(vm, addr >> PAGE_SHIFT);

-   if (!mapping)
+   mapping = amdgpu_vm_bo_lookup_mapping(queue->vm, addr >> PAGE_SHIFT);
+   amdgpu_bo_unreserve(queue->vm->root.bo);


You need to keep the VM locked until you are done with the mapping. 
Otherwise the mapping could be released at any time.


Regards,
Christian.


+   if (!mapping) {
+   DRM_ERROR("Failed to lookup amdgpu_bo_va_mapping\n");
return -EINVAL;
+   }
  
  	bo = mapping->bo_va->base.bo;


If you only need the BO then grab a temporary BO reference here, drop 
the VM lock and acquire the BO.


When you are done with everything just drop the BO lock and then the 
temporary BO reference.


Regards,
Christian.
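One way to implement what Christian describes, sketched against the amdgpu helpers already used in this function (amdgpu_bo_ref()/amdgpu_bo_unref() are the standard BO refcount helpers; this is illustrative only and not compile-tested):

```c
	r = amdgpu_bo_reserve(queue->vm->root.bo, false);
	if (r)
		return r;

	mapping = amdgpu_vm_bo_lookup_mapping(queue->vm, addr >> PAGE_SHIFT);
	if (!mapping) {
		amdgpu_bo_unreserve(queue->vm->root.bo);
		return -EINVAL;
	}

	/* Take a temporary reference while the VM is still locked, so the
	 * BO can't go away once the mapping is released. */
	bo = amdgpu_bo_ref(mapping->bo_va->base.bo);
	amdgpu_bo_unreserve(queue->vm->root.bo);

	r = amdgpu_bo_reserve(bo, true);
	if (r) {
		amdgpu_bo_unref(&bo);
		return r;
	}

	/* ... read the wptr through the kernel mapping ... */

	amdgpu_bo_unreserve(bo);
	amdgpu_bo_unref(&bo);	/* drop the temporary reference */
```

The key ordering is that the reference is taken before the VM lock is dropped, so there is no window in which the mapping (and its BO) can be freed.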


r = amdgpu_bo_reserve(bo, true);
@@ -448,7 +451,7 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data,
goto exec_fini;
}
  
-	r = amdgpu_userq_fence_read_wptr(filp, queue, &wptr);

+   r = amdgpu_userq_fence_read_wptr(queue, &wptr);
if (r)
goto exec_fini;
  




Re: [PATCH v2 02/08] drm/amdgpu: screen freeze and userq driver crash

2024-09-26 Thread Christian König

Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam:

Screen freeze and userq fence driver crash while playing Xonotic

v2: (Christian)
 - There is a chance that the fence might signal in between testing
   and grabbing the lock. Hence we can move the lock above the
   if..else check and use dma_fence_is_signaled_locked().

Signed-off-by: Arunpravin Paneer Selvam 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index 96d1caf4c815..97b1af574407 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -185,6 +185,7 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq,
struct amdgpu_userq_fence_driver *fence_drv;
struct amdgpu_userq_fence *userq_fence;
struct dma_fence *fence;
+   unsigned long flags;
  
  	fence_drv = userq->fence_drv;

if (!fence_drv)
@@ -232,14 +233,14 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq,
userq_fence->fence_drv_array_count = 0;
}
  
-	spin_lock(&fence_drv->fence_list_lock);

/* Check if hardware has already processed the job */
-   if (!dma_fence_is_signaled(fence))
+   spin_lock_irqsave(&fence_drv->fence_list_lock, flags);
+   if (!dma_fence_is_signaled_locked(fence))
list_add_tail(&userq_fence->link, &fence_drv->fences);
else
dma_fence_put(fence);
  
-	spin_unlock(&fence_drv->fence_list_lock);

+   spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags);
  
  	*f = fence;
  


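The check-then-act race closed by the patch above can be modeled outside the kernel. In this standalone sketch (plain pthreads, hypothetical names, not the real dma_fence/amdgpu API), the "already signaled?" test and the list insertion are made atomic by doing both under one lock, which is the shape of the fixed code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal model of the fixed pattern: the signaled check and the list
 * insertion happen under the same lock, so a fence cannot signal
 * between the test and the add. */

struct fake_fence {
	bool signaled;
	struct fake_fence *next;	/* link into the pending list */
};

struct fence_list {
	pthread_mutex_t lock;
	struct fake_fence *head;
};

/* Returns true if the fence was queued, false if it had already
 * signaled (mirroring the dma_fence_put() branch in the patch). */
bool fence_list_add_if_pending(struct fence_list *l, struct fake_fence *f)
{
	bool queued = false;

	pthread_mutex_lock(&l->lock);
	if (!f->signaled) {		/* tested while holding the lock */
		f->next = l->head;
		l->head = f;
		queued = true;
	}
	pthread_mutex_unlock(&l->lock);
	return queued;
}
```

The real code additionally uses the _irqsave lock variant, presumably because the fence list is also touched from the fence-signaling interrupt path, which a plain mutex model cannot show.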


Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM

2024-09-26 Thread Alex Deucher
Ping?

On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher  wrote:
>
> Driver now clears VRAM on allocation.  Bump the
> driver version so mesa knows when it will get
> cleared vram by default.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index d90473aec942..fad556be840b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -117,9 +117,10 @@
>   * - 3.56.0 - Update IB start address and size alignment for decode and 
> encode
>   * - 3.57.0 - Compute tunneling on GFX10+
>   * - 3.58.0 - Add GFX12 DCC support
> + * - 3.59.0 - Cleared VRAM
>   */
>  #define KMS_DRIVER_MAJOR   3
> -#define KMS_DRIVER_MINOR   58
> +#define KMS_DRIVER_MINOR   59
>  #define KMS_DRIVER_PATCHLEVEL  0
>
>  /*
> --
> 2.46.0
>


RE: [PATCH] drm/amdgpu: bump driver version for cleared VRAM

2024-09-26 Thread Bhardwaj, Rajneesh
[Public]

Reviewed-by: Rajneesh Bhardwaj 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Thursday, September 26, 2024 9:46 AM
To: Deucher, Alexander ; Olsak, Marek 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM

Ping?

On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher  wrote:
>
> Driver now clears VRAM on allocation.  Bump the driver version so mesa
> knows when it will get cleared vram by default.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index d90473aec942..fad556be840b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -117,9 +117,10 @@
>   * - 3.56.0 - Update IB start address and size alignment for decode and 
> encode
>   * - 3.57.0 - Compute tunneling on GFX10+
>   * - 3.58.0 - Add GFX12 DCC support
> + * - 3.59.0 - Cleared VRAM
>   */
>  #define KMS_DRIVER_MAJOR   3
> -#define KMS_DRIVER_MINOR   58
> +#define KMS_DRIVER_MINOR   59
>  #define KMS_DRIVER_PATCHLEVEL  0
>
>  /*
> --
> 2.46.0
>


Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM

2024-09-26 Thread Marek Olšák
Is GTT cleared too?

Marek

On Thu, Sep 26, 2024, 09:53 Alex Deucher  wrote:

> Ping?
>
> On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher 
> wrote:
> >
> > Driver now clears VRAM on allocation.  Bump the
> > driver version so mesa knows when it will get
> > cleared vram by default.
> >
> > Signed-off-by: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index d90473aec942..fad556be840b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -117,9 +117,10 @@
> >   * - 3.56.0 - Update IB start address and size alignment for decode and
> encode
> >   * - 3.57.0 - Compute tunneling on GFX10+
> >   * - 3.58.0 - Add GFX12 DCC support
> > + * - 3.59.0 - Cleared VRAM
> >   */
> >  #define KMS_DRIVER_MAJOR   3
> > -#define KMS_DRIVER_MINOR   58
> > +#define KMS_DRIVER_MINOR   59
> >  #define KMS_DRIVER_PATCHLEVEL  0
> >
> >  /*
> > --
> > 2.46.0
> >
>


Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM

2024-09-26 Thread Alex Deucher
On Thu, Sep 26, 2024 at 10:00 AM Marek Olšák  wrote:
>
> Is GTT cleared too?

GTT has always been cleared since it's system memory.

Alex


>
> Marek
>
> On Thu, Sep 26, 2024, 09:53 Alex Deucher  wrote:
>>
>> Ping?
>>
>> On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher  
>> wrote:
>> >
>> > Driver now clears VRAM on allocation.  Bump the
>> > driver version so mesa knows when it will get
>> > cleared vram by default.
>> >
>> > Signed-off-by: Alex Deucher 
>> > ---
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> > index d90473aec942..fad556be840b 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> > @@ -117,9 +117,10 @@
>> >   * - 3.56.0 - Update IB start address and size alignment for decode and 
>> > encode
>> >   * - 3.57.0 - Compute tunneling on GFX10+
>> >   * - 3.58.0 - Add GFX12 DCC support
>> > + * - 3.59.0 - Cleared VRAM
>> >   */
>> >  #define KMS_DRIVER_MAJOR   3
>> > -#define KMS_DRIVER_MINOR   58
>> > +#define KMS_DRIVER_MINOR   59
>> >  #define KMS_DRIVER_PATCHLEVEL  0
>> >
>> >  /*
>> > --
>> > 2.46.0
>> >


Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed

2024-09-26 Thread Alex Deucher
On Thu, Sep 26, 2024 at 10:45 AM Zhang, Yifan  wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> -Original Message-
> From: Alex Deucher 
> Sent: Thursday, September 26, 2024 1:17 AM
> To: Zhang, Yifan 
> Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ; 
> Kuehling, Felix 
> Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB 
> limitation removed
>
> On Tue, Sep 24, 2024 at 10:22 AM Zhang, Yifan  wrote:
> >
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> > The 2GB limitation in VRAM allocation was removed in the patch below. My patch is a
> > follow-up refinement for it. The remaining_size calculation was there to address the
> > 2GB limitation in contiguous VRAM allocation, and it is no longer needed after
> > the limitation is removed.
> >
>
> Thanks.  Would be good to add a reference to this commit in the commit 
> message so it's clear what you are talking about and also provide a bit more 
> of a description about why this can be cleaned up (like you did above).  One 
> other comment below as well.
>
> Sure, I will send patch V2 to change commit message.
>
> > commit b2dba064c9bdd18c7dd39066d25453af28451dbf
> > Author: Philip Yang 
> > Date:   Fri Apr 19 16:27:00 2024 -0400
> >
> > drm/amdgpu: Handle sg size limit for contiguous allocation
> >
> > Define macro AMDGPU_MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist
> > length is unsigned int, and some users of it cast to a signed int, so
> > every segment of sg table is limited to size 2GB maximum.
> >
> > For contiguous VRAM allocation, don't limit the max buddy block size in
> > order to get contiguous VRAM memory. To workaround the sg table segment
> > size limit, allocate multiple segments if contiguous size is bigger than
> > AMDGPU_MAX_SG_SEGMENT_SIZE.
> >
> > Signed-off-by: Philip Yang 
> > Reviewed-by: Christian König 
> > Signed-off-by: Alex Deucher 
> >
> >
> >
> > -Original Message-
> > From: Alex Deucher 
> > Sent: Tuesday, September 24, 2024 10:07 PM
> > To: Zhang, Yifan 
> > Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ;
> > Kuehling, Felix 
> > Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB
> > limitation removed
> >
> > On Mon, Sep 23, 2024 at 4:28 AM Yifan Zhang  wrote:
> > >
> > > Make vram alloc loop simpler after 2GB limitation removed.
> >
> > Can you provide more context?  What 2GB limitation are you referring to?
> >
> > Alex
> >
> > >
> > > Signed-off-by: Yifan Zhang 
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 15 +--
> > >  1 file changed, 5 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > > index 7d26a962f811..3d129fd61fa7 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > > @@ -455,7 +455,7 @@ static int amdgpu_vram_mgr_new(struct 
> > > ttm_resource_manager *man,
> > > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
> > > u64 vis_usage = 0, max_bytes, min_block_size;
> > > struct amdgpu_vram_mgr_resource *vres;
> > > -   u64 size, remaining_size, lpfn, fpfn;
> > > +   u64 size, lpfn, fpfn;
> > > unsigned int adjust_dcc_size = 0;
> > > struct drm_buddy *mm = &mgr->mm;
> > > struct drm_buddy_block *block; @@ -516,25 +516,23 @@ static
> > > int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
> > > adev->gmc.gmc_funcs->get_dcc_alignment)
> > > adjust_dcc_size =
> > > amdgpu_gmc_get_dcc_alignment(adev);
> > >
> > > -   remaining_size = (u64)vres->base.size;
> > > +   size = (u64)vres->base.size;
> > > if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && 
> > > adjust_dcc_size) {
> > > unsigned int dcc_size;
> > >
> > > dcc_size = roundup_pow_of_two(vres->base.size + 
> > > adjust_dcc_size);
> > > -   remaining_size = (u64)dcc_size;
> > > +   size = (u64)dcc_size;
> > >
> > > vres->flags |= DRM_BUDDY_TRIM_DISABLE;
> > > }
> > >
> > > mutex_lock(&mgr->lock);
> > > -   while (remaining_size) {
> > > +   while (true) {
>
> I think you can also drop this while loop now too.
>
> Alex
>
> There is still a "continue" path left in the loop; I think the logic may be
> broken here if the loop is dropped.

Ah, yes, sorry I missed that continue.

Alex

>
>r = drm_buddy_alloc_blocks(mm, fpfn,
> lpfn,
> size,
> min_block_size,
> &vres->blocks,
> vres->flags);
>
>  if (unlikely(r == -ENOSPC) && pages_per_block == ~0ul &&
>  !(place->flags & TTM_PL_FLAG_CONTIGUOUS)) {
>  vres->flags &= ~DRM_BUDDY_CONTIGUOUS_ALLOCATION;
>  pages_per_block = max_t(u32, 2

Re: [PATCH v7 07/10] drm/amd/display: get SAD from drm_eld when parsing EDID caps

2024-09-26 Thread Alex Hung

Mario and Melissa,

Another regression identified with this patch: a DP display is not listed
as an audio device after the patch is applied.


Cheers,
Alex Hung


On 9/18/24 15:38, Mario Limonciello wrote:

From: Melissa Wen 

drm_edid_connector_update() updates the display info, filling the ELD with
audio info from the Short Audio Descriptors in the last step of
update_display_info(). Our goal is to stop using the raw EDID, so extract
the SADs from drm_eld instead of accessing the raw EDID to get the audio caps.

Signed-off-by: Melissa Wen 
Signed-off-by: Mario Limonciello 
---
  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 20 +--
  1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 854e51c0b3fb..e0326127fd9a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -98,9 +98,7 @@ enum dc_edid_status dm_helpers_parse_edid_caps(
const struct drm_edid *drm_edid = aconnector->drm_edid;
struct drm_edid_product_id product_id;
struct edid *edid_buf = edid ? (struct edid *) edid->raw_edid : NULL;
-   struct cea_sad *sads;
-   int sad_count = -1;
-   int sadb_count = -1;
+   int sad_count, sadb_count;
int i = 0;
uint8_t *sadb = NULL;
  
@@ -129,18 +127,21 @@ enum dc_edid_status dm_helpers_parse_edid_caps(
  
  	apply_edid_quirks(&product_id, edid_caps);
  
-	sad_count = drm_edid_to_sad((struct edid *) edid->raw_edid, &sads);

+   sad_count = drm_eld_sad_count(connector->eld);
if (sad_count <= 0)
return result;
  
  	edid_caps->audio_mode_count = min(sad_count, DC_MAX_AUDIO_DESC_COUNT);

for (i = 0; i < edid_caps->audio_mode_count; ++i) {
-   struct cea_sad *sad = &sads[i];
+   struct cea_sad sad;
  
-		edid_caps->audio_modes[i].format_code = sad->format;

-   edid_caps->audio_modes[i].channel_count = sad->channels + 1;
-   edid_caps->audio_modes[i].sample_rate = sad->freq;
-   edid_caps->audio_modes[i].sample_size = sad->byte2;
+   if (drm_eld_sad_get(connector->eld, i, &sad) < 0)
+   continue;
+
+   edid_caps->audio_modes[i].format_code = sad.format;
+   edid_caps->audio_modes[i].channel_count = sad.channels + 1;
+   edid_caps->audio_modes[i].sample_rate = sad.freq;
+   edid_caps->audio_modes[i].sample_size = sad.byte2;
}
  
  	sadb_count = drm_edid_to_speaker_allocation((struct edid *) edid->raw_edid, &sadb);

@@ -155,7 +156,6 @@ enum dc_edid_status dm_helpers_parse_edid_caps(
else
edid_caps->speaker_flags = DEFAULT_SPEAKER_LOCATION;
  
-	kfree(sads);

kfree(sadb);
  
  	return result;




Re: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-26 Thread Xu, Feifei

>>I guess you are referring to the below corner case


 1) Place NPS request



 2) Unload Driver



 3) Reinstall driver with a different TOS (possible but quite unlikely)



4) Driver reload



5) Driver checks TOS version first and goes for a reset



6) reset_flag of GMC is not set, hence it doesn't refresh the NPS range.


>>I think changing the order in soc15_need_reset_on_init() to check for 
NPS request before TOS version check will solve this.


Yes, I was thinking of reset_flag and TOS reloading
(adev->init_lvl->level set to AMDGPU_INIT_LEVEL_MINIMAL_XGMI) changing
at the same time, in which case the NPS refresh would be ignored. That is
unlikely, but possible in debugging or regression-isolation cases where the
driver is packaged with a different TOS. And yes, checking for the NPS
refresh before the TOS version check will solve this.


And if we do not return early when checking the NPS request before the TOS
version check in soc15_need_reset_on_init(), we can drop the
(adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) check in the
refresh condition below:


+ refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
+   (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);

refresh = (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);

Thanks
Feifei

On 9/26/2024 5:01 PM, Xu, Feifei wrote:

+ refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
+   (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);

Is there a corner case that reloading with a different version tos and 
refreshing nps change co-exist?

Thanks,
Feifei

-Original Message-
From: amd-gfx  On Behalf Of Lijo Lazar
Sent: Tuesday, September 24, 2024 1:57 PM
To:amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking; Deucher, Alexander; Koenig, 
Christian; Bhardwaj, Rajneesh; Errabolu, 
Ramesh
Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

Add a callback to check if there is any condition detected by GMC block for 
reset on init. One case is if a pending NPS change request is detected. If 
reset is done because of NPS switch, refresh NPS info from discovery table.

Signed-off-by: Lijo Lazar
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 -  
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +
  drivers/gpu/drm/amd/amdgpu/soc15.c  |  2 ++
  3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 21f1e65c9dc9..011fe3a847d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 struct amdgpu_gmc_memrange *ranges;
 int range_cnt, ret, i, j;
 uint32_t nps_type;
+   bool refresh;

 if (!mem_ranges)
 return -EINVAL;

+   refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
+ (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
 ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges,
-   &range_cnt, false);
+   &range_cnt, refresh);

 if (ret)
 return ret;
@@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev)
 adev->dev,
 "NPS mode change request done, reload driver to complete 
the change\n");  }
+
+bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) {
+   if (adev->gmc.gmc_funcs->need_reset_on_init)
+   return adev->gmc.gmc_funcs->need_reset_on_init(adev);
+
+   return false;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index b13d6adb5efd..d4cd247fe574 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -78,6 +78,8 @@ enum amdgpu_memory_partition {
  BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \
  BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE))

+#define AMDGPU_GMC_INIT_RESET_NPS  BIT(0)
+
  /*
   * GMC page fault information
   */
@@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs {
 /* Request NPS mode */
 int (*request_mem_partition_mode)(struct amdgpu_device *adev,
   int nps_mode);
+   bool (*need_reset_on_init)(struct amdgpu_device *adev);
  };

  struct amdgpu_xgmi_ras {
@@ -314,6 +317,7 @@ struct amdgpu_gmc {
 const struct amdgpu_gmc_funcs   *gmc_funcs;
 enum amdgpu_memory_partitionrequested_nps_mode;
 uint32_t supported_nps_modes;
+   uint32_t reset_flags;

 struct amdgpu_xgmi xgmi;
 struct amdgpu_irq_src   ecc_irq;
@@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device 
*adev,  int amdgpu_gmc_request_memory_partition(struct amdg

[PATCH v3] docs/gpu: ci: update flake tests requirements

2024-09-26 Thread Vignesh Raman
Update the documentation to specify linking to a relevant GitLab
issue or email report for each new flake entry. Added specific
GitLab issue URLs for the i915, msm and amdgpu drivers.

Acked-by: Abhinav Kumar  # msm
Acked-by: Dmitry Baryshkov  # msm
Signed-off-by: Vignesh Raman 
---

v2:
- Add gitlab issue link for msm driver.

v3:
- Update docs to specify we use email reporting or GitLab issues for flake entries.

---
 Documentation/gpu/automated_testing.rst | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst
index 2d5a28866afe..03769b4a17cf 100644
--- a/Documentation/gpu/automated_testing.rst
+++ b/Documentation/gpu/automated_testing.rst
@@ -68,19 +68,24 @@ known to behave unreliably. These tests won't cause a job to fail regardless of
 the result. They will still be run.
 
 Each new flake entry must be associated with a link to the email reporting the
-bug to the author of the affected driver, the board name or Device Tree name of
-the board, the first kernel version affected, the IGT version used for tests,
-and an approximation of the failure rate.
+bug to the author of the affected driver or the relevant GitLab issue. The entry
+must also include the board name or Device Tree name, the first kernel version
+affected, the IGT version used for tests, and an approximation of the failure rate.
 
 They should be provided under the following format::
 
-  # Bug Report: $LORE_OR_PATCHWORK_URL
+  # Bug Report: $LORE_URL_OR_GITLAB_ISSUE
   # Board Name: broken-board.dtb
   # Linux Version: 6.6-rc1
   # IGT Version: 1.28-gd2af13d9f
   # Failure Rate: 100
   flaky-test
 
+Use the appropriate link below to create a GitLab issue:
+amdgpu driver: https://gitlab.freedesktop.org/drm/amd/-/issues
+i915 driver: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues
+msm driver: https://gitlab.freedesktop.org/drm/msm/-/issues
+
 drivers/gpu/drm/ci/${DRIVER_NAME}-${HW_REVISION}-skips.txt
 ---
 
-- 
2.43.0



Re: [PATCH v2] docs/gpu: ci: update flake tests requirements

2024-09-26 Thread Abhinav Kumar




On 9/26/2024 12:06 AM, Vignesh Raman wrote:

Update the documentation to require linking to a relevant GitLab
issue for each new flake entry instead of an email report. Added
specific GitLab issue URLs for i915, xe and other drivers.

Signed-off-by: Vignesh Raman 
---

v2:
- Add gitlab issue link for msm driver.

---
  Documentation/gpu/automated_testing.rst | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst
index 2d5a28866afe..f918fe56f2b0 100644
--- a/Documentation/gpu/automated_testing.rst
+++ b/Documentation/gpu/automated_testing.rst
@@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific hardware revision are
  known to behave unreliably. These tests won't cause a job to fail regardless of
  the result. They will still be run.
  
-Each new flake entry must be associated with a link to the email reporting the

-bug to the author of the affected driver, the board name or Device Tree name of
-the board, the first kernel version affected, the IGT version used for tests,
-and an approximation of the failure rate.
+Each new flake entry must include a link to the relevant GitLab issue, the board
+name or Device Tree name, the first kernel version affected, the IGT version used
+for tests and an approximation of the failure rate.
  
  They should be provided under the following format::
  
-  # Bug Report: $LORE_OR_PATCHWORK_URL

+  # Bug Report: $GITLAB_ISSUE
# Board Name: broken-board.dtb
# Linux Version: 6.6-rc1
# IGT Version: 1.28-gd2af13d9f
# Failure Rate: 100
flaky-test
  
+The GitLab issue must include the logs and the pipeline link. Use the appropriate

+link below to create an issue.
+https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver
+https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver
+https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver
+https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers
+


Acked for MSM.

Thanks

Abhinav

  drivers/gpu/drm/ci/${DRIVER_NAME}-${HW_REVISION}-skips.txt
  ---
  


Re: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-26 Thread Lazar, Lijo



On 9/27/2024 9:26 AM, Xu, Feifei wrote:
>>>I guess you are referring to the below corner case
> 
>>>    1) Place NPS request
> 
>>>     2) Unload Driver
> 
>>>    3) Reinstall driver with a different TOS (possible but quite
> unlikely)
> 
>>>     4) Driver reload
> 
>>>     5) Driver checks TOS version first and goes for a reset
> 
>>>     6) reset_flag of GMC is not set, hence it doesn't refresh
> the NPS range.
> 
>  
> 
>  
> 
>>>I think changing the order in soc15_need_reset_on_init() to check for
> NPS request before TOS version check will solve this.
> 
> Yes, I was thinking of  reset_flag and tOS reloading
> (adev->init_lvl->level set to AMDGPU_INIT_LEVEL_MINIMAL_XGMI) changing
> at the same time. And NPS refresh will be ignored. Though might be
> likely in debugging or regression isolation cases which changing driver
> packaged with different TOS.  And yes making NPS refresh checking before
> TOS version checking will solve this.
> 
> And if we do not return ahead when checking NPS request before tOS
> version change in soc15_need_reset_on_init(), we can drop 
> (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) check in below
> refresh checking:
> 

This check is used when an NPS request change is identified. During the
sw init phase it will be at MINIMAL_XGMI level, but at that point there is no
need to refresh this information as a reset is pending. It is refreshed
after a reset, when the init level returns to default.

Thanks,
Lijo

> + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
> +   (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
> 
> refresh = (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
> 
> Thanks
> Feifei
> 
> On 9/26/2024 5:01 PM, Xu, Feifei wrote:
 + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) 
 &&
 +   (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
>> Is there a corner case that reloading with a different version tos and 
>> refreshing nps change co-exist?
>>
>> Thanks,
>> Feifei
>>
>> -Original Message-
>> From: amd-gfx  On Behalf Of Lijo Lazar
>> Sent: Tuesday, September 24, 2024 1:57 PM
>> To: amd-gfx@lists.freedesktop.org
>> Cc: Zhang, Hawking ; Deucher, Alexander 
>> ; Koenig, Christian ; 
>> Bhardwaj, Rajneesh ; Errabolu, Ramesh 
>> 
>> Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init
>>
>> Add a callback to check if there is any condition detected by GMC block for 
>> reset on init. One case is if a pending NPS change request is detected. If 
>> reset is done because of NPS switch, refresh NPS info from discovery table.
>>
>> Signed-off-by: Lijo Lazar 
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 -  
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +
>>  drivers/gpu/drm/amd/amdgpu/soc15.c  |  2 ++
>>  3 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> index 21f1e65c9dc9..011fe3a847d0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
>> @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct 
>> amdgpu_device *adev,
>> struct amdgpu_gmc_memrange *ranges;
>> int range_cnt, ret, i, j;
>> uint32_t nps_type;
>> +   bool refresh;
>>
>> if (!mem_ranges)
>> return -EINVAL;
>>
>> +   refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) 
>> &&
>> + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
>> ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges,
>> -   &range_cnt, false);
>> +   &range_cnt, refresh);
>>
>> if (ret)
>> return ret;
>> @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct 
>> amdgpu_device *adev)
>> adev->dev,
>> "NPS mode change request done, reload driver to 
>> complete the change\n");  }
>> +
>> +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) {
>> +   if (adev->gmc.gmc_funcs->need_reset_on_init)
>> +   return adev->gmc.gmc_funcs->need_reset_on_init(adev);
>> +
>> +   return false;
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> index b13d6adb5efd..d4cd247fe574 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> @@ -78,6 +78,8 @@ enum amdgpu_memory_partition {
>>  BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | 
>> \
>>  BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE))
>>
>> +#define AMDGPU_GMC_INIT_RESET_NPS  BIT(0)
>> +
>>  /*
>>   * GMC page fault information
>>   */
>> @@ -169,6 +171,7 @@ struct amdgpu_gmc_fun

[PATCH v2 0/7] Add support for dynamic NPS switch

2024-09-26 Thread Lijo Lazar
This series adds support for dynamic NPS switch on GC v9.4.3/9.4.4 SOC
variants. 

In order to do a dynamic NPS switch, a sysfs interface is provided to request a new
NPS mode. If the device is part of a hive, all hive devices are required to be
in the same NPS mode, hence a request for a hive device is saved in a hive variable.
For an individual device, it is saved in a gmc block variable.

In order to do a NPS mode switch, the workflow is -

1) User places a requests through sysfs node.
2) User unloads the driver
3) During unload, driver checks for any pending NPS switch request. If any
request is pending, it places the request to PSP FW. 
4) For a hive, request is placed in one-go for all devices in the hive. If
one of the requests fails, a request is placed again to revert to current NPS
mode on the successful devices.
5) User reloads the driver.
6) On reload, driver checks if NPS switch is pending and initiates a mode-1
reset.
7) During resume after a reset, NPS ranges are read again from discovery table.
8) Driver detects the new NPS mode and makes a compatible compute partition mode
switch if required.
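From the user's point of view, steps 1, 2 and 5 of the workflow above boil down to a short command sequence. The sketch below is only an illustration: the sysfs attribute name and device path are assumptions, since the actual node is defined by the "Add sysfs interfaces for NPS mode" patch in this series.

```shell
# 1) Request a new NPS mode via the (hypothetical) sysfs node
echo NPS4 > /sys/bus/pci/devices/0000:01:00.0/memory_partition

# 2) Unload the driver; on unload the pending request is forwarded to PSP FW
modprobe -r amdgpu

# 5) Reload the driver; it detects the pending switch, performs a mode-1
#    reset, and re-reads the NPS ranges from the discovery table
modprobe amdgpu
```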

v2:
Move NPS request check ahead of TOS reload requirement check (Feifei)

Lijo Lazar (7):
  drm/amdgpu: Add option to refresh NPS data
  drm/amdgpu: Add PSP interface for NPS switch
  drm/amdgpu: Add gmc interface to request NPS mode
  drm/amdgpu: Add sysfs interfaces for NPS mode
  drm/amdgpu: Place NPS mode request on unload
  drm/amdgpu: Check gmc requirement for reset on init
  drm/amdgpu: Add NPS switch support for GC 9.4.3

 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |  68 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   | 190 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |  19 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h  |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   |  25 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c  |  39 
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h  |   5 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  44 
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c|  12 ++
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h   |  14 +-
 drivers/gpu/drm/amd/amdgpu/soc15.c|   2 +
 14 files changed, 387 insertions(+), 36 deletions(-)

-- 
2.25.1



[PATCH v2 2/7] drm/amdgpu: Add PSP interface for NPS switch

2024-09-26 Thread Lijo Lazar
Implement PSP ring command interface for memory partitioning on the fly
on the supported asics.

Signed-off-by: Rajneesh Bhardwaj 
Reviewed-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h |  1 +
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 14 +++---
 3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 944dad9ad29f..04be0fabb4f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1045,6 +1045,31 @@ static int psp_rl_load(struct amdgpu_device *adev)
return ret;
 }
 
+int psp_memory_partition(struct psp_context *psp, int mode)
+{
+   struct psp_gfx_cmd_resp *cmd;
+   int ret;
+
+   if (amdgpu_sriov_vf(psp->adev))
+   return 0;
+
+   cmd = acquire_psp_cmd_buf(psp);
+
+   cmd->cmd_id = GFX_CMD_ID_FB_NPS_MODE;
+   cmd->cmd.cmd_memory_part.mode = mode;
+
+   dev_info(psp->adev->dev,
+"Requesting %d memory partition change through PSP", mode);
+   ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr);
+   if (ret)
+   dev_err(psp->adev->dev,
+   "PSP request failed to change to NPS%d mode\n", mode);
+
+   release_psp_cmd_buf(psp);
+
+   return ret;
+}
+
 int psp_spatial_partition(struct psp_context *psp, int mode)
 {
struct psp_gfx_cmd_resp *cmd;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 76fa18ffc045..567cb1f924ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -553,6 +553,7 @@ int psp_load_fw_list(struct psp_context *psp,
 void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t 
bin_size);
 
 int psp_spatial_partition(struct psp_context *psp, int mode);
+int psp_memory_partition(struct psp_context *psp, int mode);
 
 int is_psp_fw_valid(struct psp_bin_desc bin);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h 
b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
index 604301371e4f..f4a91b126c73 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
+++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
@@ -103,8 +103,10 @@ enum psp_gfx_cmd_id
 GFX_CMD_ID_AUTOLOAD_RLC   = 0x0021,   /* Indicates all graphics fw 
loaded, start RLC autoload */
 GFX_CMD_ID_BOOT_CFG   = 0x0022,   /* Boot Config */
 GFX_CMD_ID_SRIOV_SPATIAL_PART = 0x0027,   /* Configure spatial 
partitioning mode */
-   /*IDs of performance monitoring/profiling*/
-   GFX_CMD_ID_CONFIG_SQ_PERFMON = 0x0046,   /* Config CGTT_SQ_CLK_CTRL 
*/
+/*IDs of performance monitoring/profiling*/
+GFX_CMD_ID_CONFIG_SQ_PERFMON  = 0x0046,   /* Config CGTT_SQ_CLK_CTRL */
+/* Dynamic memory partitioninig (NPS mode change)*/
+GFX_CMD_ID_FB_NPS_MODE = 0x0048,  /* Configure memory 
partitioning mode */
 };
 
 /* PSP boot config sub-commands */
@@ -362,6 +364,11 @@ struct psp_gfx_cmd_config_sq_perfmon {
uint8_t reserved[5];
 };
 
+struct psp_gfx_cmd_fb_memory_part {
+   uint32_t mode; /* requested NPS mode */
+   uint32_t resvd;
+};
+
 /* All GFX ring buffer commands. */
 union psp_gfx_commands
 {
@@ -376,7 +383,8 @@ union psp_gfx_commands
 struct psp_gfx_cmd_load_toc cmd_load_toc;
 struct psp_gfx_cmd_boot_cfg boot_cfg;
 struct psp_gfx_cmd_sriov_spatial_part cmd_spatial_part;
-   struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon;
+struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon;
+struct psp_gfx_cmd_fb_memory_part cmd_memory_part;
 };
 
 struct psp_gfx_uresp_reserved
-- 
2.25.1
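The fill-and-submit flow of psp_memory_partition() in the patch above can be sketched in isolation. The types and the submit routine below are simplified, hypothetical stand-ins for the driver's PSP command buffer machinery, not the actual kernel API:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the PSP command types. */
enum cmd_id { CMD_ID_FB_NPS_MODE = 0x0048 };

struct nps_cmd {
	enum cmd_id cmd_id;
	uint32_t mode;	/* requested NPS mode */
};

/* Mock submit: accept only a well-formed, non-zero NPS request. */
static int submit_cmd(const struct nps_cmd *cmd)
{
	if (cmd->cmd_id != CMD_ID_FB_NPS_MODE || cmd->mode == 0)
		return -1;	/* firmware rejected the request */
	return 0;
}

/* Mirrors the shape of psp_memory_partition(): fill, submit, report. */
static int request_memory_partition(uint32_t mode)
{
	struct nps_cmd cmd = { .cmd_id = CMD_ID_FB_NPS_MODE, .mode = mode };
	int ret = submit_cmd(&cmd);

	if (ret)
		fprintf(stderr, "request failed to change to NPS%u mode\n",
			mode);
	return ret;
}
```

The real function additionally skips the request entirely under SR-IOV and serializes access to a shared command buffer.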



[PATCH v2 1/7] drm/amdgpu: Add option to refresh NPS data

2024-09-26 Thread Lijo Lazar
In certain use cases, NPS data needs to be refreshed from the discovery
table. Add an API parameter to refresh NPS data from the discovery
table.

Signed-off-by: Lijo Lazar 
Reviewed-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   |  2 +-
 3 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 4bd61c169ca8..9f9a1867da72 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1723,37 +1723,75 @@ union nps_info {
struct nps_info_v1_0 v1;
 };
 
+static int amdgpu_discovery_refresh_nps_info(struct amdgpu_device *adev,
+union nps_info *nps_data)
+{
+   uint64_t vram_size, pos, offset;
+   struct nps_info_header *nhdr;
+   struct binary_header bhdr;
+   uint16_t checksum;
+
+   vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20;
+   pos = vram_size - DISCOVERY_TMR_OFFSET;
+   amdgpu_device_vram_access(adev, pos, &bhdr, sizeof(bhdr), false);
+
+   offset = le16_to_cpu(bhdr.table_list[NPS_INFO].offset);
+   checksum = le16_to_cpu(bhdr.table_list[NPS_INFO].checksum);
+
+   amdgpu_device_vram_access(adev, (pos + offset), nps_data,
+ sizeof(*nps_data), false);
+
+   nhdr = (struct nps_info_header *)(nps_data);
+   if (!amdgpu_discovery_verify_checksum((uint8_t *)nps_data,
+ le32_to_cpu(nhdr->size_bytes),
+ checksum)) {
+   dev_err(adev->dev, "nps data refresh, checksum mismatch\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 int amdgpu_discovery_get_nps_info(struct amdgpu_device *adev,
  uint32_t *nps_type,
  struct amdgpu_gmc_memrange **ranges,
- int *range_cnt)
+ int *range_cnt, bool refresh)
 {
struct amdgpu_gmc_memrange *mem_ranges;
struct binary_header *bhdr;
union nps_info *nps_info;
+   union nps_info nps_data;
u16 offset;
-   int i;
+   int i, r;
 
if (!nps_type || !range_cnt || !ranges)
return -EINVAL;
 
-   if (!adev->mman.discovery_bin) {
-   dev_err(adev->dev,
-   "fetch mem range failed, ip discovery uninitialized\n");
-   return -EINVAL;
-   }
+   if (refresh) {
+   r = amdgpu_discovery_refresh_nps_info(adev, &nps_data);
+   if (r)
+   return r;
+   nps_info = &nps_data;
+   } else {
+   if (!adev->mman.discovery_bin) {
+   dev_err(adev->dev,
+   "fetch mem range failed, ip discovery 
uninitialized\n");
+   return -EINVAL;
+   }
 
-   bhdr = (struct binary_header *)adev->mman.discovery_bin;
-   offset = le16_to_cpu(bhdr->table_list[NPS_INFO].offset);
+   bhdr = (struct binary_header *)adev->mman.discovery_bin;
+   offset = le16_to_cpu(bhdr->table_list[NPS_INFO].offset);
 
-   if (!offset)
-   return -ENOENT;
+   if (!offset)
+   return -ENOENT;
 
-   /* If verification fails, return as if NPS table doesn't exist */
-   if (amdgpu_discovery_verify_npsinfo(adev, bhdr))
-   return -ENOENT;
+   /* If verification fails, return as if NPS table doesn't exist 
*/
+   if (amdgpu_discovery_verify_npsinfo(adev, bhdr))
+   return -ENOENT;
 
-   nps_info = (union nps_info *)(adev->mman.discovery_bin + offset);
+   nps_info =
+   (union nps_info *)(adev->mman.discovery_bin + offset);
+   }
 
switch (le16_to_cpu(nps_info->v1.header.version_major)) {
case 1:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h
index f5d36525ec3e..b44d56465c5b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h
@@ -33,6 +33,6 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev);
 int amdgpu_discovery_get_nps_info(struct amdgpu_device *adev,
  uint32_t *nps_type,
  struct amdgpu_gmc_memrange **ranges,
- int *range_cnt);
+ int *range_cnt, bool refresh);
 
 #endif /* __AMDGPU_DISCOVERY__ */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 17a19d49d30a..4f088a5368d8 100644
-
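The refresh path in the patch above re-reads the NPS table from VRAM and validates it against the checksum stored in the binary header. A simplified byte-sum verification in the spirit of amdgpu_discovery_verify_checksum() — not the exact kernel routine — looks like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes of the table into a 16-bit accumulator and compare it
 * with the checksum recorded in the binary header (simplified sketch). */
static bool verify_checksum(const uint8_t *data, size_t size,
			    uint16_t expected)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i < size; i++)
		sum += data[i];	/* wraps naturally at 16 bits */

	return sum == expected;
}
```

On mismatch the driver logs an error and returns -EINVAL, so a corrupted table read never feeds into the NPS range parsing.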

[PATCH v2 5/7] drm/amdgpu: Place NPS mode request on unload

2024-09-26 Thread Lijo Lazar
If a user has requested an NPS mode switch, place the request through PSP
during unload of the driver. For devices which are part of a hive, all
requests are placed together. If one of them fails, revert to the
current NPS mode.

Signed-off-by: Lijo Lazar 
Signed-off-by: Rajneesh Bhardwaj 
Reviewed-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  | 47 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 38 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h |  4 ++
 5 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 95331294509c..d16cdcdb2114 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2428,6 +2428,7 @@ amdgpu_pci_remove(struct pci_dev *pdev)
struct amdgpu_device *adev = drm_to_adev(dev);
 
amdgpu_xcp_dev_unplug(adev);
+   amdgpu_gmc_prepare_nps_mode_change(adev);
drm_dev_unplug(dev);
 
if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 24a1f931d9ed..21f1e65c9dc9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1345,3 +1345,50 @@ int amdgpu_gmc_request_memory_partition(struct 
amdgpu_device *adev,
 
return psp_memory_partition(&adev->psp, nps_mode);
 }
+
+static inline bool amdgpu_gmc_need_nps_switch_req(struct amdgpu_device *adev,
+ int req_nps_mode,
+ int cur_nps_mode)
+{
+   return (((BIT(req_nps_mode) & adev->gmc.supported_nps_modes) ==
+   BIT(req_nps_mode)) &&
+   req_nps_mode != cur_nps_mode);
+}
+
+void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev)
+{
+   int req_nps_mode, cur_nps_mode, r;
+   struct amdgpu_hive_info *hive;
+
+   if (amdgpu_sriov_vf(adev) || !adev->gmc.supported_nps_modes ||
+   !adev->gmc.gmc_funcs->request_mem_partition_mode)
+   return;
+
+   cur_nps_mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev);
+   hive = amdgpu_get_xgmi_hive(adev);
+   if (hive) {
+   req_nps_mode = atomic_read(&hive->requested_nps_mode);
+   if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode,
+   cur_nps_mode)) {
+   amdgpu_put_xgmi_hive(hive);
+   return;
+   }
+   r = amdgpu_xgmi_request_nps_change(adev, hive, req_nps_mode);
+   amdgpu_put_xgmi_hive(hive);
+   goto out;
+   }
+
+   req_nps_mode = adev->gmc.requested_nps_mode;
+   if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode, cur_nps_mode))
+   return;
+
+   /* even if this fails, we should let driver unload w/o blocking */
+   r = adev->gmc.gmc_funcs->request_mem_partition_mode(adev, req_nps_mode);
+out:
+   if (r)
+   dev_err(adev->dev, "NPS mode change request failed\n");
+   else
+   dev_info(
+   adev->dev,
+   "NPS mode change request done, reload driver to 
complete the change\n");
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 1a50639a003a..b13d6adb5efd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -467,4 +467,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 
 int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
int nps_mode);
+void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 5d721ccb9dfd..db2c1b11b813 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -1564,3 +1564,41 @@ int amdgpu_xgmi_reset_on_init(struct amdgpu_device *adev)
 
return 0;
 }
+
+int amdgpu_xgmi_request_nps_change(struct amdgpu_device *adev,
+  struct amdgpu_hive_info *hive,
+  int req_nps_mode)
+{
+   struct amdgpu_device *tmp_adev;
+   int cur_nps_mode, r;
+
+   /* This is expected to be called only during unload of driver. The
+* request needs to be placed only once for all devices in the hive. If
+* one of them fail, revert the request for previous successful devices.
+* After placing the request, make hive mode as UNKNOWN so that other
+* devices don't request anymore.
+*/
+   mutex_lock(&hive->hive_lock);
+   list_for_each_entry(tmp_adev
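The gate in amdgpu_gmc_need_nps_switch_req() above is a plain membership-plus-difference test on a bitmask and can be exercised standalone (BIT() expanded to a shift; the mask values in the test are illustrative, not the driver's enum values):

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Mirrors amdgpu_gmc_need_nps_switch_req(): a switch is needed only if
 * the requested mode is within the supported mask and differs from the
 * current mode. */
static bool need_nps_switch_req(uint32_t supported_modes,
				int req_mode, int cur_mode)
{
	return ((BIT(req_mode) & supported_modes) == BIT(req_mode)) &&
	       req_mode != cur_mode;
}
```

Requesting the current mode or an unsupported mode is treated as "nothing to do", which is why the unload path silently returns in those cases.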

[PATCH v2 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-26 Thread Lijo Lazar
Add a callback to check whether the GMC block detects any condition that
requires a reset on init. One such case is a pending NPS change request.
If the reset is done because of an NPS switch, refresh NPS info from the
discovery table.

Signed-off-by: Lijo Lazar 
---
v2:
Move NPS request check ahead of TOS reload requirement check (Feifei)

 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +
 drivers/gpu/drm/amd/amdgpu/soc15.c  |  2 ++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 21f1e65c9dc9..011fe3a847d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device 
*adev,
struct amdgpu_gmc_memrange *ranges;
int range_cnt, ret, i, j;
uint32_t nps_type;
+   bool refresh;
 
if (!mem_ranges)
return -EINVAL;
 
+   refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
+ (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges,
-   &range_cnt, false);
+   &range_cnt, refresh);
 
if (ret)
return ret;
@@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct 
amdgpu_device *adev)
adev->dev,
"NPS mode change request done, reload driver to 
complete the change\n");
 }
+
+bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev)
+{
+   if (adev->gmc.gmc_funcs->need_reset_on_init)
+   return adev->gmc.gmc_funcs->need_reset_on_init(adev);
+
+   return false;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index b13d6adb5efd..d4cd247fe574 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -78,6 +78,8 @@ enum amdgpu_memory_partition {
 BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \
 BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE))
 
+#define AMDGPU_GMC_INIT_RESET_NPS  BIT(0)
+
 /*
  * GMC page fault information
  */
@@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs {
/* Request NPS mode */
int (*request_mem_partition_mode)(struct amdgpu_device *adev,
  int nps_mode);
+   bool (*need_reset_on_init)(struct amdgpu_device *adev);
 };
 
 struct amdgpu_xgmi_ras {
@@ -314,6 +317,7 @@ struct amdgpu_gmc {
const struct amdgpu_gmc_funcs   *gmc_funcs;
enum amdgpu_memory_partitionrequested_nps_mode;
uint32_t supported_nps_modes;
+   uint32_t reset_flags;
 
struct amdgpu_xgmi xgmi;
struct amdgpu_irq_src   ecc_irq;
@@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
int nps_mode);
 void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev);
+bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 619933f252aa..6c0913f0d296 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -831,6 +831,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device 
*adev)
if (adev->asic_type == CHIP_RENOIR)
return true;
 
+   if (amdgpu_gmc_need_reset_on_init(adev))
+   return true;
if (amdgpu_psp_tos_reload_needed(adev))
return true;
/* Just return false for soc15 GPUs.  Reset does not seem to
-- 
2.25.1
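The reset_flags handshake added in this patch — set AMDGPU_GMC_INIT_RESET_NPS when the pending NPS request is detected, then consume it on resume — can be sketched as a minimal flag lifecycle. All names below are simplified stand-ins for the driver structures:

```c
#include <stdbool.h>
#include <stdint.h>

#define GMC_INIT_RESET_NPS (1u << 0)	/* stands in for AMDGPU_GMC_INIT_RESET_NPS */

struct gmc {
	uint32_t reset_flags;
	bool nps_switch_pending;	/* as reported by the NBIO status register */
};

/* need_reset_on_init(): record *why* the reset is needed in reset_flags. */
static bool need_reset_on_init(struct gmc *gmc)
{
	if (gmc->nps_switch_pending) {
		gmc->reset_flags |= GMC_INIT_RESET_NPS;
		return true;
	}
	return false;
}

/* resume(): refresh memory ranges only for an NPS reset, then clear the
 * flag so a later resume does not re-read the discovery table. */
static bool resume_refreshes_ranges(struct gmc *gmc)
{
	if (gmc->reset_flags & GMC_INIT_RESET_NPS) {
		gmc->reset_flags &= ~GMC_INIT_RESET_NPS;
		return true;	/* caller re-reads the NPS memory ranges */
	}
	return false;
}
```

Encoding the reason in a flag is what lets gmc_v9_0_resume() distinguish an NPS-triggered mode-1 reset from any other reset on the same path.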



[PATCH v2 4/7] drm/amdgpu: Add sysfs interfaces for NPS mode

2024-09-26 Thread Lijo Lazar
Add a sysfs interface to see the available NPS modes to switch to -

cat /sys/bus/pci/devices/../available_memory_partition

Make the current_memory_partition sysfs node read/write for requesting a
new NPS mode. The request is only cached; a driver unload/reload at a
later point is required to switch to the new NPS mode.

Ex:
echo NPS1 > /sys/bus/pci/devices/../current_memory_partition
echo NPS4 > /sys/bus/pci/devices/../current_memory_partition

The above interfaces will be available only if the SOC supports more than
one NPS mode.

Also modify the current memory partition sysfs logic to be more
generic.

Signed-off-by: Lijo Lazar 
Reviewed-by: Rajneesh Bhardwaj 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 114 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |   6 ++
 2 files changed, 104 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 758fda4e628f..24a1f931d9ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1130,6 +1130,79 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
return ret;
 }
 
+static const char *nps_desc[] = {
+   [AMDGPU_NPS1_PARTITION_MODE] = "NPS1",
+   [AMDGPU_NPS2_PARTITION_MODE] = "NPS2",
+   [AMDGPU_NPS3_PARTITION_MODE] = "NPS3",
+   [AMDGPU_NPS4_PARTITION_MODE] = "NPS4",
+   [AMDGPU_NPS6_PARTITION_MODE] = "NPS6",
+   [AMDGPU_NPS8_PARTITION_MODE] = "NPS8",
+};
+
+static ssize_t available_memory_partition_show(struct device *dev,
+  struct device_attribute *addr,
+  char *buf)
+{
+   struct drm_device *ddev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(ddev);
+   int size = 0, mode;
+   char *sep = "";
+
+   for_each_inst(mode, adev->gmc.supported_nps_modes) {
+   size += sysfs_emit_at(buf, size, "%s%s", sep, nps_desc[mode]);
+   sep = ", ";
+   }
+   size += sysfs_emit_at(buf, size, "\n");
+
+   return size;
+}
+
+static ssize_t current_memory_partition_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+   struct drm_device *ddev = dev_get_drvdata(dev);
+   struct amdgpu_device *adev = drm_to_adev(ddev);
+   enum amdgpu_memory_partition mode;
+   struct amdgpu_hive_info *hive;
+   int i;
+
+   mode = UNKNOWN_MEMORY_PARTITION_MODE;
+   for_each_inst(i, adev->gmc.supported_nps_modes) {
+   if (!strncasecmp(nps_desc[i], buf, strlen(nps_desc[i]))) {
+   mode = i;
+   break;
+   }
+   }
+
+   if (mode == UNKNOWN_MEMORY_PARTITION_MODE)
+   return -EINVAL;
+
+   if (mode == adev->gmc.gmc_funcs->query_mem_partition_mode(adev)) {
+   dev_info(
+   adev->dev,
+   "requested NPS mode is same as current NPS mode, 
skipping\n");
+   return count;
+   }
+
+   /* If device is part of hive, all devices in the hive should request the
+* same mode. Hence store the requested mode in hive.
+*/
+   hive = amdgpu_get_xgmi_hive(adev);
+   if (hive) {
+   atomic_set(&hive->requested_nps_mode, mode);
+   amdgpu_put_xgmi_hive(hive);
+   } else {
+   adev->gmc.requested_nps_mode = mode;
+   }
+
+   dev_info(
+   adev->dev,
+   "NPS mode change requested, please remove and reload the 
driver\n");
+
+   return count;
+}
+
 static ssize_t current_memory_partition_show(
struct device *dev, struct device_attribute *addr, char *buf)
 {
@@ -1138,38 +1211,47 @@ static ssize_t current_memory_partition_show(
enum amdgpu_memory_partition mode;
 
mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev);
-   switch (mode) {
-   case AMDGPU_NPS1_PARTITION_MODE:
-   return sysfs_emit(buf, "NPS1\n");
-   case AMDGPU_NPS2_PARTITION_MODE:
-   return sysfs_emit(buf, "NPS2\n");
-   case AMDGPU_NPS3_PARTITION_MODE:
-   return sysfs_emit(buf, "NPS3\n");
-   case AMDGPU_NPS4_PARTITION_MODE:
-   return sysfs_emit(buf, "NPS4\n");
-   case AMDGPU_NPS6_PARTITION_MODE:
-   return sysfs_emit(buf, "NPS6\n");
-   case AMDGPU_NPS8_PARTITION_MODE:
-   return sysfs_emit(buf, "NPS8\n");
-   default:
+   if ((mode > ARRAY_SIZE(nps_desc)) ||
+   (BIT(mode) & AMDGPU_ALL_NPS_MASK) != BIT(mode))
return sysfs_emit(buf, "UNKNOWN\n");
-   }
+
+   return sysfs_emit(buf, "%s\n", nps_desc[mode]);
 }
 
-static DEVICE_ATTR_RO(current_memory_partition);
+static DEVICE_ATTR_RW(current_memory_par
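The store handler above resolves the written string against a descriptor table with strncasecmp(). That lookup can be exercised on its own; note the indices returned here are positions in this dense table, not the driver's enum values, and -EINVAL is collapsed to -1:

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp */

static const char *nps_desc[] = { "NPS1", "NPS2", "NPS3",
				  "NPS4", "NPS6", "NPS8" };

/* Mirrors the lookup in current_memory_partition_store(): return the
 * index of the first case-insensitively matching mode name, or -1 for
 * an unknown string. */
static int parse_nps_mode(const char *buf)
{
	size_t i;

	for (i = 0; i < sizeof(nps_desc) / sizeof(nps_desc[0]); i++) {
		if (!strncasecmp(nps_desc[i], buf, strlen(nps_desc[i])))
			return (int)i;
	}
	return -1;
}
```

Comparing only strlen(nps_desc[i]) characters is what makes the sysfs write tolerant of a trailing newline from echo.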

[PATCH v2 0/7] Add support for dynamic NPS switch

2024-09-26 Thread Lijo Lazar
This series adds support for dynamic NPS switch on GC v9.4.3/9.4.4 SOC
variants.

In order to do a dynamic NPS switch, a sysfs interface is provided to
request a new NPS mode. If the device is part of a hive, all hive devices
are required to be in the same NPS mode; hence, a hive device's request is
saved in a hive variable. For an individual device, it is saved in a GMC
block variable.

In order to do a NPS mode switch, the workflow is -

1) User places a request through the sysfs node.
2) User unloads the driver.
3) During unload, driver checks for any pending NPS switch request. If a
request is pending, it places the request with PSP FW.
4) For a hive, the request is placed in one go for all devices in the hive.
If one of the requests fails, a request is placed again to revert the
successful devices to the current NPS mode.
5) User reloads the driver.
6) On reload, driver checks if an NPS switch is pending and initiates a
mode-1 reset.
7) During resume after the reset, NPS ranges are read again from the
discovery table.
8) Driver detects the new NPS mode and makes a compatible compute partition
mode switch if required.

v2:
Move NPS request check ahead of TOS reload requirement check (Feifei)

Lijo Lazar (7):
  drm/amdgpu: Add option to refresh NPS data
  drm/amdgpu: Add PSP interface for NPS switch
  drm/amdgpu: Add gmc interface to request NPS mode
  drm/amdgpu: Add sysfs interfaces for NPS mode
  drm/amdgpu: Place NPS mode request on unload
  drm/amdgpu: Check gmc requirement for reset on init
  drm/amdgpu: Add NPS switch support for GC 9.4.3

 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |  68 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c   | 190 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h   |  19 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h  |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   |  25 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c  |  39 
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h  |   5 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  44 
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c|  12 ++
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h   |  14 +-
 drivers/gpu/drm/amd/amdgpu/soc15.c|   2 +
 14 files changed, 387 insertions(+), 36 deletions(-)

-- 
2.25.1
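The eight-step flow in the cover letter boils down to a request handshake carried across an unload/reload cycle. A toy model of that handshake (all names hypothetical; the firmware side is reduced to a stored value and the mode-1 reset to an assignment):

```c
/* Toy model of the NPS switch handshake across unload/reload. */
struct dev {
	int cur_nps;		/* mode currently programmed */
	int requested_nps;	/* cached sysfs request, 0 = none */
	int fw_pending_nps;	/* request handed to PSP FW on unload */
};

/* Step 1: the sysfs write only caches the request. */
static void sysfs_request(struct dev *d, int mode)
{
	d->requested_nps = mode;
}

/* Step 3: on unload, forward a meaningful cached request to firmware. */
static void driver_unload(struct dev *d)
{
	if (d->requested_nps && d->requested_nps != d->cur_nps)
		d->fw_pending_nps = d->requested_nps;
	d->requested_nps = 0;
}

/* Steps 6-8: on reload, a pending request triggers a reset, after which
 * the new mode is read back from the discovery data. */
static void driver_reload(struct dev *d)
{
	if (d->fw_pending_nps) {
		d->cur_nps = d->fw_pending_nps;	/* mode-1 reset applies it */
		d->fw_pending_nps = 0;
	}
}
```

The split matters: nothing is reprogrammed while the driver is live, so the switch only ever takes effect through the reset on the next load.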



[PATCH v2 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3

2024-09-26 Thread Lijo Lazar
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and
GC v9.4.4 currently support this. NPS switch is only supported if an SOC
supports multiple NPS modes.

Signed-off-by: Lijo Lazar 
Signed-off-by: Rajneesh Bhardwaj 
Reviewed-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 44 
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c   | 12 +++
 3 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
index f61d117b0caf..79c2f807b9fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
@@ -101,6 +101,7 @@ struct amdgpu_nbio_funcs {
int (*get_compute_partition_mode)(struct amdgpu_device *adev);
u32 (*get_memory_partition_mode)(struct amdgpu_device *adev,
 u32 *supp_modes);
+   bool (*is_nps_switch_requested)(struct amdgpu_device *adev);
u64 (*get_pcie_replay_count)(struct amdgpu_device *adev);
void (*set_reg_remap)(struct amdgpu_device *adev);
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index cafcb24449df..6a95402985ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1395,6 +1395,17 @@ gmc_v9_0_query_memory_partition(struct amdgpu_device 
*adev)
return gmc_v9_0_get_memory_partition(adev, NULL);
 }
 
+static bool gmc_v9_0_need_reset_on_init(struct amdgpu_device *adev)
+{
+   if (adev->nbio.funcs &&
+   adev->nbio.funcs->is_nps_switch_requested(adev)) {
+   adev->gmc.reset_flags |= AMDGPU_GMC_INIT_RESET_NPS;
+   return true;
+   }
+
+   return false;
+}
+
 static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
.flush_gpu_tlb_pasid = gmc_v9_0_flush_gpu_tlb_pasid,
@@ -1406,6 +1417,8 @@ static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = 
{
.override_vm_pte_flags = gmc_v9_0_override_vm_pte_flags,
.get_vbios_fb_size = gmc_v9_0_get_vbios_fb_size,
.query_mem_partition_mode = &gmc_v9_0_query_memory_partition,
+   .request_mem_partition_mode = &amdgpu_gmc_request_memory_partition,
+   .need_reset_on_init = &gmc_v9_0_need_reset_on_init,
 };
 
 static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)
@@ -1545,6 +1558,28 @@ static void gmc_v9_0_set_xgmi_ras_funcs(struct amdgpu_device *adev)
adev->gmc.xgmi.ras = &xgmi_ras;
 }
 
+static void gmc_v9_0_init_nps_details(struct amdgpu_device *adev)
+{
+   adev->gmc.supported_nps_modes = 0;
+
+   if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU))
+   return;
+
+   /*TODO: Check PSP version also which supports NPS switch. Otherwise keep
+* supported modes as 0.
+*/
+   switch (amdgpu_ip_version(adev, GC_HWIP, 0)) {
+   case IP_VERSION(9, 4, 3):
+   case IP_VERSION(9, 4, 4):
+   adev->gmc.supported_nps_modes =
+   BIT(AMDGPU_NPS1_PARTITION_MODE) |
+   BIT(AMDGPU_NPS4_PARTITION_MODE);
+   break;
+   default:
+   break;
+   }
+}
+
 static int gmc_v9_0_early_init(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -2165,6 +2200,7 @@ static int gmc_v9_0_sw_init(void *handle)
if (r)
return r;
 
+   gmc_v9_0_init_nps_details(adev);
/*
 * number of VMs
 * VMID 0 is reserved for System
@@ -2440,6 +2476,14 @@ static int gmc_v9_0_resume(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   /* If a reset is done for NPS mode switch, read the memory range
+* information again.
+*/
+   if (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS) {
+   gmc_v9_0_init_sw_mem_ranges(adev, adev->gmc.mem_partitions);
+   adev->gmc.reset_flags &= ~AMDGPU_GMC_INIT_RESET_NPS;
+   }
+
r = gmc_v9_0_hw_init(adev);
if (r)
return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
index d1bd79bbae53..8a0a63ac88d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
@@ -401,6 +401,17 @@ static int nbio_v7_9_get_compute_partition_mode(struct amdgpu_device *adev)
return px;
 }
 
+static bool nbio_v7_9_is_nps_switch_requested(struct amdgpu_device *adev)
+{
+   u32 tmp;
+
+   tmp = RREG32_SOC15(NBIO, 0, regBIF_BX_PF0_PARTITION_MEM_STATUS);
+   tmp = REG_GET_FIELD(tmp, BIF_BX_PF0_PARTITION_MEM_STATUS,
+   CHANGE_STATUE);
+
+   /* 0x8 - NPS switch requested */
+   return (tmp == 0x8);
+}
 static u32 nbio_v7_9_get_memory_partition_mode(struct amdgpu_device *adev

[PATCH v2 3/7] drm/amdgpu: Add gmc interface to request NPS mode

2024-09-26 Thread Lijo Lazar
Add a common interface in GMC to request NPS mode through PSP. Also add
a variable in the hive info and GMC control structures to track the last
requested mode.

Signed-off-by: Rajneesh Bhardwaj 
Signed-off-by: Lijo Lazar 
Reviewed-by: Feifei Xu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  | 16 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h  |  6 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4f088a5368d8..758fda4e628f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -1247,3 +1247,19 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 
return ret;
 }
+
+int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
+   int nps_mode)
+{
+   /* Not supported on VF devices and APUs */
+   if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU))
+   return -EOPNOTSUPP;
+
+   if (!adev->psp.funcs) {
+   dev_err(adev->dev,
+   "PSP interface not available for nps mode change request");
+   return -EINVAL;
+   }
+
+   return psp_memory_partition(&adev->psp, nps_mode);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 33b2adffd58b..f5be5112b742 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -161,6 +161,9 @@ struct amdgpu_gmc_funcs {
 
enum amdgpu_memory_partition (*query_mem_partition_mode)(
struct amdgpu_device *adev);
+   /* Request NPS mode */
+   int (*request_mem_partition_mode)(struct amdgpu_device *adev,
+ int nps_mode);
 };
 
 struct amdgpu_xgmi_ras {
@@ -304,6 +307,7 @@ struct amdgpu_gmc {
struct amdgpu_mem_partition_info *mem_partitions;
uint8_t num_mem_partitions;
const struct amdgpu_gmc_funcs   *gmc_funcs;
+   enum amdgpu_memory_partitionrequested_nps_mode;
 
struct amdgpu_xgmi xgmi;
struct amdgpu_irq_src   ecc_irq;
@@ -455,4 +459,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev,
 struct amdgpu_mem_partition_info *mem_ranges,
 int exp_ranges);
 
+int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
+   int nps_mode);
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index b17e63c98a99..5d721ccb9dfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -667,6 +667,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
task_barrier_init(&hive->tb);
hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
hive->hi_req_gpu = NULL;
+   atomic_set(&hive->requested_nps_mode, UNKNOWN_MEMORY_PARTITION_MODE);
 
/*
 * hive pstate on boot is high in vega20 so we have to go to low
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index d652727ca565..67abadb4f298 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -46,6 +46,7 @@ struct amdgpu_hive_info {
atomic_t ras_recovery;
struct ras_event_manager event_mgr;
struct work_struct reset_on_init_work;
+   atomic_t requested_nps_mode;
 };
 
 struct amdgpu_pcs_ras_field {
-- 
2.25.1



[PATCH v2] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed

2024-09-26 Thread Yifan Zhang
This is a follow-up to commit b2dba064c9bd ("drm/amdgpu: Handle
sg size limit for contiguous allocation").

The 2GB limitation in VRAM allocation was removed in the above commit.
The remaining_size calculation only existed to address that limitation
in contiguous VRAM allocation, so it is no longer needed. Simplify the
vram alloc logic accordingly.

Signed-off-by: Yifan Zhang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 7d26a962f811..3d129fd61fa7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -455,7 +455,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
u64 vis_usage = 0, max_bytes, min_block_size;
struct amdgpu_vram_mgr_resource *vres;
-   u64 size, remaining_size, lpfn, fpfn;
+   u64 size, lpfn, fpfn;
unsigned int adjust_dcc_size = 0;
struct drm_buddy *mm = &mgr->mm;
struct drm_buddy_block *block;
@@ -516,25 +516,23 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
adev->gmc.gmc_funcs->get_dcc_alignment)
adjust_dcc_size = amdgpu_gmc_get_dcc_alignment(adev);
 
-   remaining_size = (u64)vres->base.size;
+   size = (u64)vres->base.size;
if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && adjust_dcc_size) {
unsigned int dcc_size;
 
		dcc_size = roundup_pow_of_two(vres->base.size + adjust_dcc_size);
-   remaining_size = (u64)dcc_size;
+   size = (u64)dcc_size;
 
vres->flags |= DRM_BUDDY_TRIM_DISABLE;
}
 
mutex_lock(&mgr->lock);
-   while (remaining_size) {
+   while (true) {
if (tbo->page_alignment)
min_block_size = (u64)tbo->page_alignment << PAGE_SHIFT;
else
min_block_size = mgr->default_page_size;
 
-   size = remaining_size;
-
		if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && adjust_dcc_size)
min_block_size = size;
else if ((size >= (u64)pages_per_block << PAGE_SHIFT) &&
@@ -562,10 +560,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man,
if (unlikely(r))
goto error_free_blocks;
 
-   if (size > remaining_size)
-   remaining_size = 0;
-   else
-   remaining_size -= size;
+   break;
}
mutex_unlock(&mgr->lock);
 
-- 
2.43.0