Re: [PATCH v2] docs/gpu: ci: update flake tests requirements
Hi Maxime, On 26/09/24 12:56, Maxime Ripard wrote: On Thu, Sep 26, 2024 at 12:36:49PM GMT, Vignesh Raman wrote: Update the documentation to require linking to a relevant GitLab issue for each new flake entry instead of an email report. Added specific GitLab issue URLs for i915, xe and other drivers. Signed-off-by: Vignesh Raman --- v2: - Add gitlab issue link for msm driver. --- Documentation/gpu/automated_testing.rst | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst index 2d5a28866afe..f918fe56f2b0 100644 --- a/Documentation/gpu/automated_testing.rst +++ b/Documentation/gpu/automated_testing.rst @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific hardware revision are known to behave unreliably. These tests won't cause a job to fail regardless of the result. They will still be run. -Each new flake entry must be associated with a link to the email reporting the -bug to the author of the affected driver, the board name or Device Tree name of -the board, the first kernel version affected, the IGT version used for tests, -and an approximation of the failure rate. +Each new flake entry must include a link to the relevant GitLab issue, the board +name or Device Tree name, the first kernel version affected, the IGT version used +for tests and an approximation of the failure rate. They should be provided under the following format:: - # Bug Report: $LORE_OR_PATCHWORK_URL + # Bug Report: $GITLAB_ISSUE # Board Name: broken-board.dtb # Linux Version: 6.6-rc1 # IGT Version: 1.28-gd2af13d9f # Failure Rate: 100 flaky-test +The GitLab issue must include the logs and the pipeline link. Use the appropriate +link below to create an issue. +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers + I can't comment for the others, but drm-misc at least still requires reporting issues by mail, so, no, sorry, we can't switch to gitlab only for now. In https://gitlab.freedesktop.org/drm/ we have xe, i915, msm, nouveau, and amd (only for issues). In drm-ci we test i915, msm and amd. So I will add GitLab links for these and for other drivers use email reporting. I will reword this section. Thanks. Regards, Vignesh
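As an illustration of the format proposed in the patch, a filled-in flake entry could look like the following (the issue URL, board name, kernel version, failure rate and test name are made up for the example, not taken from a real report):

    # Bug Report: https://gitlab.freedesktop.org/drm/msm/-/issues/123
    # Board Name: sc7180-trogdor-lazor-limozeen-nots-r5.dtb
    # Linux Version: 6.11-rc5
    # IGT Version: 1.28-gd2af13d9f
    # Failure Rate: 50
    kms_flip@flip-vs-expired-vblank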
Re: [PATCH v1 4/9] drm/amd/pm: don't update runpm last_usage on debugfs getter
On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: > Reading pm values from the GPU shouldn't prevent it to be suspended > by resetting the last active timestamp (eg: if an background app > monitors GPU sensors every second, it would prevent the autosuspend > sequence to trigger). > > Tested-by: Mario Limonciello > Signed-off-by: Pierre-Eric Pelloux-Prayer > --- > drivers/gpu/drm/amd/pm/amdgpu_pm.c | 26 -- > 1 file changed, 26 deletions(-) > > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c > index 042a4dd1bd6a..c8f34b1a4d8e 100644 > --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c > @@ -151,7 +151,6 @@ static ssize_t amdgpu_get_power_dpm_state(struct device > *dev, > > amdgpu_dpm_get_current_power_state(adev, &pm); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return sysfs_emit(buf, "%s\n", > @@ -275,7 +274,6 @@ static ssize_t > amdgpu_get_power_dpm_force_performance_level(struct device *dev, > > level = amdgpu_dpm_get_performance_level(adev); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return sysfs_emit(buf, "%s\n", > @@ -373,7 +371,6 @@ static ssize_t amdgpu_get_pp_num_states(struct device > *dev, > if (amdgpu_dpm_get_pp_num_states(adev, &data)) > memset(&data, 0, sizeof(data)); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > buf_len = sysfs_emit(buf, "states: %d\n", data.nums); > @@ -410,7 +407,6 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev, > > ret = amdgpu_dpm_get_pp_num_states(adev, &data); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > if (ret) > @@ -536,7 +532,6 @@ static ssize_t amdgpu_get_pp_table(struct device *dev, > > size = amdgpu_dpm_get_pp_table(adev, &table); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > if (size <= 0) > @@ -866,7 +861,6 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device > *dev, > if (size == 0) > size = sysfs_emit(buf, "\n"); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return size; > @@ -944,7 +938,6 @@ static ssize_t amdgpu_get_pp_features(struct device *dev, > if (size <= 0) > size = sysfs_emit(buf, "\n"); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return size; > @@ -1014,7 +1007,6 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device > *dev, > if (size == 0) > size = sysfs_emit(buf, "\n"); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return size; > @@ -1259,7 +1251,6 @@ static ssize_t amdgpu_get_pp_sclk_od(struct device *dev, > > value = amdgpu_dpm_get_sclk_od(adev); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return sysfs_emit(buf, "%d\n", value); > @@ -1317,7 +1308,6 @@ static ssize_t amdgpu_get_pp_mclk_od(struct device *dev, > > value = amdgpu_dpm_get_mclk_od(adev); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return sysfs_emit(buf, "%d\n", value); > @@ -1397,7 +1387,6 @@ static ssize_t amdgpu_get_pp_power_profile_mode(struct > device *dev, > if (size <= 0) > size = sysfs_emit(buf, "\n"); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return size; > @@ -1485,7 +1474,6 @@ static int amdgpu_hwmon_get_sensor_generic(struct > amdgpu_device *adev, > /* get the sensor value */ > 
r = amdgpu_dpm_read_sensor(adev, sensor, query, &size); > > - pm_runtime_mark_last_busy(adev_to_drm(adev)->dev); > pm_runtime_put_autosuspend(adev_to_drm(adev)->dev); > > return r; > @@ -1601,7 +1589,6 @@ static ssize_t amdgpu_get_pcie_bw(struct device *dev, > > amdgpu_asic_get_pcie_usage(adev, &count0, &count1); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return sysfs_emit(buf, "%llu %llu %i\n", > @@ -1734,7 +1721,6 @@ static ssize_t amdgpu_get_apu_thermal_cap(struct device > *dev, > else > size = sysfs_emit(buf, "failed to get thermal limit\n"); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend(ddev->dev); > > return size; > @@ -1807,7 +1793,6 @@ static ssize_t amdgpu_get_pm_metrics(struct device *dev, > > size = amdgpu_dpm_get_pm_metrics(adev, buf, PAGE_SIZE); > > - pm_runtime_mark_last_busy(ddev->dev); > pm_runtime_put_autosuspend
Re: radeon ARUBA NULL pointer dereference
Follow-up: qla2xxx appears to be fixed; the most recent test was with: commit 684a64bf32b6e488004e0ad7f0d7e922798f65b6 (HEAD -> master, origin/master, origin/HEAD) Merge: f7fccaa77271 68898131d2df Author: Linus Torvalds Date: Tue Sep 24 15:44:18 2024 -0700 Merge tag 'nfs-for-6.12-1' of git://git.linux-nfs.org/projects/anna/linux-nfs I suppose the most likely fix was this one, but I do not have the HW to verify (the report I got was on an AMD EPYC 7262): commit b348b6d17fd1d5d89b86db602f02bea54a754bd8 Author: Leon Romanovsky Date: Sun Sep 22 21:09:48 2024 +0300 dma-mapping: report unlimited DMA addressing in IOMMU DMA path -Ewan On Tue, Sep 24, 2024 at 3:30 PM Ewan Milne wrote: > > I think we are seeing a similar problem w/qla2xxx panicing at boot: >
RE: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init
[AMD Official Use Only - AMD Internal Distribution Only] >>+ refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && >>+ (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); Is there a corner case that reloading with a different version tos and refreshing nps change co-exist? Thanks, Feifei -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Tuesday, September 24, 2024 1:57 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Koenig, Christian ; Bhardwaj, Rajneesh ; Errabolu, Ramesh Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init Add a callback to check if there is any condition detected by GMC block for reset on init. One case is if a pending NPS change request is detected. If reset is done because of NPS switch, refresh NPS info from discovery table. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 - drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 + drivers/gpu/drm/amd/amdgpu/soc15.c | 2 ++ 3 files changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 21f1e65c9dc9..011fe3a847d0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, struct amdgpu_gmc_memrange *ranges; int range_cnt, ret, i, j; uint32_t nps_type; + bool refresh; if (!mem_ranges) return -EINVAL; + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges, - &range_cnt, false); + &range_cnt, refresh); if (ret) return ret; @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev) adev->dev, "NPS mode change request done, reload driver to complete the change\n"); } + +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) { + if (adev->gmc.gmc_funcs->need_reset_on_init) + return adev->gmc.gmc_funcs->need_reset_on_init(adev); + + return false; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index b13d6adb5efd..d4cd247fe574 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -78,6 +78,8 @@ enum amdgpu_memory_partition { BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \ BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE)) +#define AMDGPU_GMC_INIT_RESET_NPS BIT(0) + /* * GMC page fault information */ @@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs { /* Request NPS mode */ int (*request_mem_partition_mode)(struct amdgpu_device *adev, int nps_mode); + bool (*need_reset_on_init)(struct amdgpu_device *adev); }; struct amdgpu_xgmi_ras { @@ -314,6 +317,7 @@ struct amdgpu_gmc { const struct amdgpu_gmc_funcs *gmc_funcs; enum amdgpu_memory_partitionrequested_nps_mode; uint32_t supported_nps_modes; + uint32_t reset_flags; struct amdgpu_xgmi xgmi; struct amdgpu_irq_src ecc_irq; @@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, int nps_mode); void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev); +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index 619933f252aa..97ca4931a7ef 100644 --- 
a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -833,6 +833,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev) if (amdgpu_psp_tos_reload_needed(adev)) return true; + if (amdgpu_gmc_need_reset_on_init(adev)) + return true; /* Just return false for soc15 GPUs. Reset does not seem to * be necessary. */ -- 2.25.1
Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL
Am 26.09.24 um 11:31 schrieb Paneer Selvam, Arunpravin: Hi Christian, On 9/26/2024 2:57 PM, Christian König wrote: Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: [SNIP] +int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_userq_fence_info *fence_info = NULL; + struct drm_amdgpu_userq_wait *wait_info = data; + u32 *syncobj_handles, *bo_handles; + struct dma_fence **fences = NULL; + u32 num_syncobj, num_bo_handles; + struct drm_gem_object **gobj; + struct drm_exec exec; + int r, i, entry, cnt; + u64 num_fences = 0; + + num_bo_handles = wait_info->num_bo_handles; + bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array), + sizeof(u32) * num_bo_handles); + if (IS_ERR(bo_handles)) + return PTR_ERR(bo_handles); + + num_syncobj = wait_info->num_syncobj_handles; + syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array), + sizeof(u32) * num_syncobj); + if (IS_ERR(syncobj_handles)) { + r = PTR_ERR(syncobj_handles); + goto free_bo_handles; + } + + /* Array of GEM object handles */ + gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL); + if (!gobj) { + r = -ENOMEM; + goto free_syncobj_handles; + } + + for (entry = 0; entry < num_bo_handles; entry++) { + gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]); + if (!gobj[entry]) { + r = -ENOENT; + goto put_gobj; + } + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); + drm_exec_until_all_locked(&exec) { + r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0); + drm_exec_retry_on_contention(&exec); + if (r) { + drm_exec_fini(&exec); + goto put_gobj; + } + } + + if (!wait_info->num_fences) { + /* Count syncobj's fence */ + for (i = 0; i < num_syncobj; i++) { + struct dma_fence *fence; + + r = drm_syncobj_find_fence(filp, syncobj_handles[i], + 0, 0, &fence); + dma_fence_put(fence); + + if (r || !fence) + continue; + + num_fences++; + } + + /* Count GEM objects fence */ + for (i = 0; i < num_bo_handles; i++) { + struct dma_resv_iter resv_cursor; + struct dma_fence *fence; + + dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv, + dma_resv_usage_rw(wait_info->bo_wait_flags & + AMDGPU_USERQ_BO_WRITE), fence) + num_fences++; We should probably adjust the UAPI here once more. The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the whole IOCTL instead of per BO. So the best approach would probably be to drop the AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers and writers. Can you work on that Arun? Shouldn't be more than a bit typing exercise. Sure, I will modify and send the next version of this file. Thanks. In the meantime I'm going to review the rest of the series, so there could be more comments. But please update the UAPI first. Regards, Christian. Thanks, Arun. Thanks, Christian.
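For reference, one possible shape of the reworked wait UAPI along the lines suggested above; the structure below is only a sketch to show the per-BO reader/writer split, not the interface that was eventually merged, and all field names are placeholders:

    /* Sketch only - not the final amdgpu UAPI. The single bo_handles_array
     * plus the IOCTL-wide AMDGPU_USERQ_BO_WRITE flag is replaced by two
     * arrays, so read vs. write usage is expressed per BO.
     */
    struct drm_amdgpu_userq_wait {
            __u64   syncobj_handles_array;          /* in: array of u32 syncobj handles */
            __u32   num_syncobj_handles;
            __u32   pad;
            __u64   bo_read_handles_array;          /* in: BOs the queue only reads */
            __u64   bo_write_handles_array;         /* in: BOs the queue writes */
            __u32   num_bo_read_handles;
            __u32   num_bo_write_handles;
            __u64   userq_fence_info;               /* out: array of drm_amdgpu_userq_fence_info */
            __u64   num_fences;                     /* in: capacity, out: fence count */
    };

With such a split the wait path could pick dma_resv_usage_rw() per object (writers wait on all fences, readers only on writers) instead of applying one flag to every BO in the IOCTL.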
Re: [PATCH v2] drm/sched: Further optimise drm_sched_entity_push_job
On Mon, 2024-09-23 at 15:35 +0100, Tvrtko Ursulin wrote: > > Ping Christian and Philipp - reasonably happy with v2? I think it's > the > only unreviewed patch from the series. Howdy, sry for the delay, I had been traveling. I have a few nits below regarding the commit message. Besides, I'm OK with that, thx for your work :) > > Regards, > > Tvrtko > > On 16/09/2024 18:30, Tvrtko Ursulin wrote: > > From: Tvrtko Ursulin > > > > Having removed one re-lock cycle on the entity->lock in a patch > > titled > > "drm/sched: Optimise drm_sched_entity_push_job", > > with only a tiny bit > > larger refactoring we can do the same optimisation Well, the commit message does not state which optimization that is. One would have to look for the previous patch, which you apparently cannot provide a commit ID for yet because it's not in Big Boss's branch. In this case I am for including a sentence about what is being optimized also because > > on the rq->lock. > > (Currently both drm_sched_rq_add_entity() and > > drm_sched_rq_update_fifo_locked() take and release the same lock.) > > > > To achieve this we make drm_sched_rq_update_fifo_locked() and it's not clear what the "this" that's being achieved is. > > drm_sched_rq_add_entity() expect the rq->lock to be held. > > > > We also align drm_sched_rq_update_fifo_locked(), > > drm_sched_rq_add_entity() and > > drm_sched_rq_remove_fifo_locked() function signatures, by adding rq > > as a > > parameter to the latter. > > > > v2: > > * Fix after rebase of the series. > > * Avoid naming incosistency between drm_sched_rq_add/remove. > > (Christian) > > > > Signed-off-by: Tvrtko Ursulin Reviewed-by: Philipp Stanner > > Cc: Christian König > > Cc: Alex Deucher > > Cc: Luben Tuikov > > Cc: Matthew Brost > > Cc: Philipp Stanner > > --- > > drivers/gpu/drm/scheduler/sched_entity.c | 12 -- > > drivers/gpu/drm/scheduler/sched_main.c | 29 --- > > - > > include/drm/gpu_scheduler.h | 3 ++- > > 3 files changed, 26 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > > b/drivers/gpu/drm/scheduler/sched_entity.c > > index d982cebc6bee..8ace1f1ea66b 100644 > > --- a/drivers/gpu/drm/scheduler/sched_entity.c > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > > @@ -515,9 +515,14 @@ struct drm_sched_job > > *drm_sched_entity_pop_job(struct drm_sched_entity *entity) > > > > next = to_drm_sched_job(spsc_queue_peek(&entity- > > >job_queue)); > > if (next) { > > + struct drm_sched_rq *rq; > > + > > spin_lock(&entity->lock); > > - drm_sched_rq_update_fifo_locked(entity, > > + rq = entity->rq; > > + spin_lock(&rq->lock); > > + drm_sched_rq_update_fifo_locked(entity, > > rq, > > next- > > >submit_ts); > > + spin_unlock(&rq->lock); > > spin_unlock(&entity->lock); > > } > > } > > @@ -618,11 +623,14 @@ void drm_sched_entity_push_job(struct > > drm_sched_job *sched_job) > > sched = rq->sched; > > > > atomic_inc(sched->score); > > + > > + spin_lock(&rq->lock); > > drm_sched_rq_add_entity(rq, entity); > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > - drm_sched_rq_update_fifo_locked(entity, > > submit_ts); > > + drm_sched_rq_update_fifo_locked(entity, > > rq, submit_ts); > > > > + spin_unlock(&rq->lock); > > spin_unlock(&entity->lock); > > > > drm_sched_wakeup(sched, entity); > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index 18a952f73ecb..5c83fb92bb89 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -153,17 +153,18 @@ 
static __always_inline bool > > drm_sched_entity_compare_before(struct rb_node *a, > > return ktime_before(ent_a->oldest_job_waiting, ent_b- > > >oldest_job_waiting); > > } > > > > -static inline void drm_sched_rq_remove_fifo_locked(struct > > drm_sched_entity *entity) > > +static void drm_sched_rq_remove_fifo_locked(struct I think the commit message should contain a short sentence about why you removed the inline. AKA "As we're at it, remove the inline function specifier from drm_sched_rq_remove_fifo_locked() because XYZ" P. > > drm_sched_entity *entity, > > + struct drm_sched_rq > > *rq) > > { > > - struct drm_sched_rq *rq = entity->rq; > > - > > if (!RB_EMPTY_NODE(&entity->rb_tree_node)) { > > rb_erase_cached(&entity->rb_tree_node, &rq- > > >rb_tree_root); > > RB_CLEAR_NODE(&entity->rb_tree_node); > > } > > } > > > > -void drm_sched_rq_update_fifo_locked(struct drm_sched
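For context on what is being optimised here, a condensed before/after sketch of the push-job path (simplified from the diff above, not verbatim upstream code):

    /* Before: entity->lock is held, but each helper takes and releases
     * rq->lock internally, so the same lock is acquired twice per push.
     */
    spin_lock(&entity->lock);
    drm_sched_rq_add_entity(rq, entity);                    /* locks rq->lock */
    drm_sched_rq_update_fifo_locked(entity, submit_ts);     /* locks rq->lock again */
    spin_unlock(&entity->lock);

    /* After: the caller takes rq->lock once and both helpers expect it held. */
    spin_lock(&entity->lock);
    spin_lock(&rq->lock);
    drm_sched_rq_add_entity(rq, entity);
    drm_sched_rq_update_fifo_locked(entity, rq, submit_ts);
    spin_unlock(&rq->lock);
    spin_unlock(&entity->lock);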
Re: [PATCH v2] drm/sched: Further optimise drm_sched_entity_push_job
On Tue, 2024-09-24 at 14:02 +0200, Christian König wrote: > Am 24.09.24 um 11:58 schrieb Tvrtko Ursulin: > > > > On 24/09/2024 10:45, Tvrtko Ursulin wrote: > > > > > > On 24/09/2024 09:20, Christian König wrote: > > > > Am 16.09.24 um 19:30 schrieb Tvrtko Ursulin: > > > > > From: Tvrtko Ursulin > > > > > > > > > > Having removed one re-lock cycle on the entity->lock in a > > > > > patch titled > > > > > "drm/sched: Optimise drm_sched_entity_push_job", with only a > > > > > tiny bit > > > > > larger refactoring we can do the same optimisation on the rq- > > > > > >lock. > > > > > (Currently both drm_sched_rq_add_entity() and > > > > > drm_sched_rq_update_fifo_locked() take and release the same > > > > > lock.) > > > > > > > > > > To achieve this we make drm_sched_rq_update_fifo_locked() and > > > > > drm_sched_rq_add_entity() expect the rq->lock to be held. > > > > > > > > > > We also align drm_sched_rq_update_fifo_locked(), > > > > > drm_sched_rq_add_entity() and > > > > > drm_sched_rq_remove_fifo_locked() function signatures, by > > > > > adding rq > > > > > as a > > > > > parameter to the latter. > > > > > > > > > > v2: > > > > > * Fix after rebase of the series. > > > > > * Avoid naming incosistency between > > > > > drm_sched_rq_add/remove. > > > > > (Christian) > > > > > > > > > > Signed-off-by: Tvrtko Ursulin > > > > > Cc: Christian König > > > > > Cc: Alex Deucher > > > > > Cc: Luben Tuikov > > > > > Cc: Matthew Brost > > > > > Cc: Philipp Stanner > > > > > > > > Reviewed-by: Christian König > > > > > > Thanks! > > > > > > Are you okay to pull into drm-misc-next or we should do some more > > > testing on this? > > > > > > And/or should I resend the series once more in it's entirety so > > > this > > > v2 is not a reply-to to the original? > > > > I have to respin for the drm_sched_wakeup fix that landed. > > When I should push the series to drm-misc-next then please send it to > me > once more. > > On the other hand we should now have to maintainers for that. Yup, will pick up this responsibilities soonish. Danilo and I have been on conference recently and I'm out of office soon for a bit, but you can expect me / us to take over that work soonish in early autumn. Regards, P. > > Regards, > Christian. 
> > > > > Regards, > > > > Tvrtko > > > > > > > > Regards, > > > > > > Tvrtko > > > > > > > > > > > > --- > > > > > drivers/gpu/drm/scheduler/sched_entity.c | 12 -- > > > > > drivers/gpu/drm/scheduler/sched_main.c | 29 > > > > > > > > > > include/drm/gpu_scheduler.h | 3 ++- > > > > > 3 files changed, 26 insertions(+), 18 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > > > > > b/drivers/gpu/drm/scheduler/sched_entity.c > > > > > index d982cebc6bee..8ace1f1ea66b 100644 > > > > > --- a/drivers/gpu/drm/scheduler/sched_entity.c > > > > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > > > > > @@ -515,9 +515,14 @@ struct drm_sched_job > > > > > *drm_sched_entity_pop_job(struct drm_sched_entity *entity) > > > > > next = > > > > > to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); > > > > > if (next) { > > > > > + struct drm_sched_rq *rq; > > > > > + > > > > > spin_lock(&entity->lock); > > > > > - drm_sched_rq_update_fifo_locked(entity, > > > > > + rq = entity->rq; > > > > > + spin_lock(&rq->lock); > > > > > + drm_sched_rq_update_fifo_locked(entity, rq, > > > > > next->submit_ts); > > > > > + spin_unlock(&rq->lock); > > > > > spin_unlock(&entity->lock); > > > > > } > > > > > } > > > > > @@ -618,11 +623,14 @@ void drm_sched_entity_push_job(struct > > > > > drm_sched_job *sched_job) > > > > > sched = rq->sched; > > > > > atomic_inc(sched->score); > > > > > + > > > > > + spin_lock(&rq->lock); > > > > > drm_sched_rq_add_entity(rq, entity); > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > > - drm_sched_rq_update_fifo_locked(entity, > > > > > submit_ts); > > > > > + drm_sched_rq_update_fifo_locked(entity, rq, > > > > > submit_ts); > > > > > + spin_unlock(&rq->lock); > > > > > spin_unlock(&entity->lock); > > > > > drm_sched_wakeup(sched, entity); > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > > > > b/drivers/gpu/drm/scheduler/sched_main.c > > > > > index 18a952f73ecb..5c83fb92bb89 100644 > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > @@ -153,17 +153,18 @@ static __always_inline bool > > > > > drm_sched_entity_compare_before(struct rb_node *a, > > > > > return ktime_before(ent_a->oldest_job_waiting, > > > > > ent_b->oldest_job_waiting); > > > > > } > > > > > -static inline void drm_sched_rq_remove_fifo_
[PATCH] drm/amdgpu: fix PTE copy corruption for sdma 7
[AMD Official Use Only - AMD Internal Distribution Only] From: Frank Min Without setting the dcc bit, there is random PTE copy corruption on sdma 7. So add this bit and update the packet format accordingly. Signed-off-by: Frank Min --- drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c index cfd8e183ad50..a8763496aed3 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c @@ -1080,13 +1080,16 @@ static void sdma_v7_0_vm_copy_pte(struct amdgpu_ib *ib, unsigned bytes = count * 8; ib->ptr[ib->length_dw++] = SDMA_PKT_COPY_LINEAR_HEADER_OP(SDMA_OP_COPY) | - SDMA_PKT_COPY_LINEAR_HEADER_SUB_OP(SDMA_SUBOP_COPY_LINEAR); + SDMA_PKT_COPY_LINEAR_HEADER_SUB_OP(SDMA_SUBOP_COPY_LINEAR) | + SDMA_PKT_COPY_LINEAR_HEADER_CPV(1); + ib->ptr[ib->length_dw++] = bytes - 1; ib->ptr[ib->length_dw++] = 0; /* src/dst endian swap */ ib->ptr[ib->length_dw++] = lower_32_bits(src); ib->ptr[ib->length_dw++] = upper_32_bits(src); ib->ptr[ib->length_dw++] = lower_32_bits(pe); ib->ptr[ib->length_dw++] = upper_32_bits(pe); + ib->ptr[ib->length_dw++] = 0; } @@ -1744,7 +1747,7 @@ static void sdma_v7_0_set_buffer_funcs(struct amdgpu_device *adev) } static const struct amdgpu_vm_pte_funcs sdma_v7_0_vm_pte_funcs = { - .copy_pte_num_dw = 7, + .copy_pte_num_dw = 8, .copy_pte = sdma_v7_0_vm_copy_pte, .write_pte = sdma_v7_0_vm_write_pte, .set_pte_pde = sdma_v7_0_vm_set_pte_pde, -- 2.43.0
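Spelled out, the copy-linear packet built by sdma_v7_0_vm_copy_pte() after this patch is eight dwords, which matches the copy_pte_num_dw bump from 7 to 8. The field meanings below are read off the diff; the purpose of the trailing dword is assumed to be the extra word required when the CPV bit is set:

    DW0: header - OP=SDMA_OP_COPY, SUB_OP=SDMA_SUBOP_COPY_LINEAR, CPV=1
    DW1: bytes - 1              (count * 8 bytes of PTEs to copy)
    DW2: 0                      (src/dst endian swap)
    DW3: lower 32 bits of src
    DW4: upper 32 bits of src
    DW5: lower 32 bits of pe    (destination page-table address)
    DW6: upper 32 bits of pe
    DW7: 0                      (extra dword added together with the CPV bit)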
Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters
On 9/25/2024 8:02 PM, Pierre-Eric Pelloux-Prayer wrote: > > > Le 25/09/2024 à 15:35, Lazar, Lijo a écrit : >> >> >> On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: >>> Don't wake up the GPU for reading pm values. Instead, take a runtime >>> powermanagement ref when trying to read it iff the GPU is already >>> awake. >>> >>> This avoids spurious wake ups (eg: from applets). >>> >>> We use pm_runtime_get_if_in_active(ign_usage_count=true) because we care >>> about "is the GPU awake?" not about "is the GPU awake and something else >>> prevents suspend?". >>> >>> Tested-by: Mario Limonciello >>> Signed-off-by: Pierre-Eric Pelloux-Prayer >>> >>> --- >>> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++--- >>> 1 file changed, 69 insertions(+), 69 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> index c8f34b1a4d8e..f1f339b75380 100644 >>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >> >> Not all our devices support runtime PM. >> >> As per the API documentation - >> >> "-EINVAL is returned if runtime PM is disabled for the >> device, in which case also the usage_count will remain unmodified." > > This doc is about "dev->power.disable_depth > 0", not about runtime PM > being disabled (like you would get by using pm_runtime_forbid). > > When runtime PM is disabled, usage_count is always >= 1 and the status > is RPM_ACTIVE so pm_runtime_get_if_active will succeed. > > I tested on 2 dGPUs without runtime PM, and both seemed to work fine. > Thanks for clarifying. Found that default enablement is taken care by pci base driver. https://elixir.bootlin.com/linux/v6.11/source/drivers/pci/pci.c#L3173 Thanks, Lijo > Pierre-Eric > >> >> If that's true, this won't be working on devices without runtime PM >> support. 
>> >> Thanks, >> Lijo >>> amdgpu_dpm_get_current_power_state(adev, &pm); >>> @@ -268,9 +268,9 @@ static ssize_t >>> amdgpu_get_power_dpm_force_performance_level(struct device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> level = amdgpu_dpm_get_performance_level(adev); >>> @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> if (amdgpu_dpm_get_pp_num_states(adev, &data)) >>> memset(&data, 0, sizeof(data)); >>> @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> amdgpu_dpm_get_current_power_state(adev, &pm); >>> @@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> size = amdgpu_dpm_get_pp_table(adev, &table); >>> @@ -840,9 +840,9 @@ static ssize_t >>> amdgpu_get_pp_od_clk_voltage(struct device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> for (clk_index = 0 ; clk_index < 6 ; clk_index++) { >>> ret = amdgpu_dpm_emit_clock_levels(adev, >>> od_clocks[clk_index], buf, &size); >>> @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> -
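For readers following the series, a condensed sketch of what a sysfs getter looks like once both patch 4/9 and patch 5/9 are applied, modelled on the amdgpu_get_pp_sclk_od() hunks in the diffs (simplified, not a verbatim copy of the file):

    static ssize_t amdgpu_get_pp_sclk_od(struct device *dev,
                                         struct device_attribute *attr,
                                         char *buf)
    {
            struct drm_device *ddev = dev_get_drvdata(dev);
            struct amdgpu_device *adev = drm_to_adev(ddev);
            uint32_t value;
            int ret;

            if (adev->in_suspend && !adev->in_runpm)
                    return -EPERM;

            /* Only take a runtime PM reference if the device is already
             * active; the second argument ignores the usage count.
             */
            ret = pm_runtime_get_if_active(ddev->dev, true);
            if (ret <= 0)
                    return ret ?: -EPERM;   /* 0 means suspended: don't wake it */

            value = amdgpu_dpm_get_sclk_od(adev);

            /* No pm_runtime_mark_last_busy() here, so a tool polling this
             * file does not keep pushing the autosuspend deadline forward.
             */
            pm_runtime_put_autosuspend(ddev->dev);

            return sysfs_emit(buf, "%d\n", value);
    }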
RE: [PATCH 5/7] drm/amdgpu: Place NPS mode request on unload
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Feifei Xu -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Tuesday, September 24, 2024 1:57 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Koenig, Christian ; Bhardwaj, Rajneesh ; Errabolu, Ramesh Subject: [PATCH 5/7] drm/amdgpu: Place NPS mode request on unload If a user has requested NPS mode switch, place the request through PSP during unload of the driver. For devices which are part of a hive, all requests are placed together. If one of them fails, revert back to the current NPS mode. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 47 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 38 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 4 ++ 5 files changed, 92 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 95331294509c..d16cdcdb2114 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2428,6 +2428,7 @@ amdgpu_pci_remove(struct pci_dev *pdev) struct amdgpu_device *adev = drm_to_adev(dev); amdgpu_xcp_dev_unplug(adev); + amdgpu_gmc_prepare_nps_mode_change(adev); drm_dev_unplug(dev); if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 24a1f931d9ed..21f1e65c9dc9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1345,3 +1345,50 @@ int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, return psp_memory_partition(&adev->psp, nps_mode); } + +static inline bool amdgpu_gmc_need_nps_switch_req(struct amdgpu_device *adev, + int req_nps_mode, + int cur_nps_mode) +{ + return (((BIT(req_nps_mode) & adev->gmc.supported_nps_modes) == + BIT(req_nps_mode)) && + req_nps_mode != cur_nps_mode); +} + +void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev) { + int req_nps_mode, cur_nps_mode, r; + struct amdgpu_hive_info *hive; + + if (amdgpu_sriov_vf(adev) || !adev->gmc.supported_nps_modes || + !adev->gmc.gmc_funcs->request_mem_partition_mode) + return; + + cur_nps_mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); + hive = amdgpu_get_xgmi_hive(adev); + if (hive) { + req_nps_mode = atomic_read(&hive->requested_nps_mode); + if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode, + cur_nps_mode)) { + amdgpu_put_xgmi_hive(hive); + return; + } + r = amdgpu_xgmi_request_nps_change(adev, hive, req_nps_mode); + amdgpu_put_xgmi_hive(hive); + goto out; + } + + req_nps_mode = adev->gmc.requested_nps_mode; + if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode, cur_nps_mode)) + return; + + /* even if this fails, we should let driver unload w/o blocking */ + r = adev->gmc.gmc_funcs->request_mem_partition_mode(adev, +req_nps_mode); +out: + if (r) + dev_err(adev->dev, "NPS mode change request failed\n"); + else + dev_info( + adev->dev, + "NPS mode change request done, reload driver to complete the +change\n"); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index 1a50639a003a..b13d6adb5efd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -467,4 +467,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, int amdgpu_gmc_request_memory_partition(struct amdgpu_device 
*adev, int nps_mode); +void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev); + #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index 5d721ccb9dfd..db2c1b11b813 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -1564,3 +1564,41 @@ int amdgpu_xgmi_reset_on_init(struct amdgpu_device *adev) return 0; } + +int amdgpu_xgmi_request_nps_change(struct amdgpu_device *adev, + struct amdgpu_hive_info *hive, + int req_nps_mode) +{ + struct amdgpu_device *tmp_adev; + int cur_nps_mode, r; + + /* This is expected to be called only during unload of drive
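The hive handling that the commit message describes (placing the request on every device together and reverting to the current mode if one fails) is cut off in the quoted diff; a rough sketch of how such a loop could look, assuming the usual hive device-list iteration, is:

    /* Rough sketch only - the real amdgpu_xgmi_request_nps_change() body is
     * truncated above; list and field names follow existing amdgpu usage.
     */
    r = 0;
    list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
            r = tmp_adev->gmc.gmc_funcs->request_mem_partition_mode(tmp_adev,
                                                                    req_nps_mode);
            if (r)
                    break;
    }
    if (r) {
            /* Roll the already-switched devices back to the current mode so
             * the whole hive stays in a consistent NPS configuration.
             */
            list_for_each_entry_continue_reverse(tmp_adev, &hive->device_list,
                                                 gmc.xgmi.head)
                    tmp_adev->gmc.gmc_funcs->request_mem_partition_mode(tmp_adev,
                                                                        cur_nps_mode);
    }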
RE: [PATCH 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Feifei Xu -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Tuesday, September 24, 2024 1:57 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Koenig, Christian ; Bhardwaj, Rajneesh ; Errabolu, Ramesh Subject: [PATCH 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3 Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC v9.4.4 currently support this. NPS switch is only supported if an SOC supports multiple NPS modes. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 + drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 44 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 12 +++ 3 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h index f61d117b0caf..79c2f807b9fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h @@ -101,6 +101,7 @@ struct amdgpu_nbio_funcs { int (*get_compute_partition_mode)(struct amdgpu_device *adev); u32 (*get_memory_partition_mode)(struct amdgpu_device *adev, u32 *supp_modes); + bool (*is_nps_switch_requested)(struct amdgpu_device *adev); u64 (*get_pcie_replay_count)(struct amdgpu_device *adev); void (*set_reg_remap)(struct amdgpu_device *adev); }; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index cafcb24449df..6a95402985ef 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1395,6 +1395,17 @@ gmc_v9_0_query_memory_partition(struct amdgpu_device *adev) return gmc_v9_0_get_memory_partition(adev, NULL); } +static bool gmc_v9_0_need_reset_on_init(struct amdgpu_device *adev) { + if (adev->nbio.funcs && + adev->nbio.funcs->is_nps_switch_requested(adev)) { + adev->gmc.reset_flags |= AMDGPU_GMC_INIT_RESET_NPS; + return true; + } + + return false; +} + static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = { .flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb, .flush_gpu_tlb_pasid = gmc_v9_0_flush_gpu_tlb_pasid, @@ -1406,6 +1417,8 @@ static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = { .override_vm_pte_flags = gmc_v9_0_override_vm_pte_flags, .get_vbios_fb_size = gmc_v9_0_get_vbios_fb_size, .query_mem_partition_mode = &gmc_v9_0_query_memory_partition, + .request_mem_partition_mode = &amdgpu_gmc_request_memory_partition, + .need_reset_on_init = &gmc_v9_0_need_reset_on_init, }; static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev) @@ -1545,6 +1558,28 @@ static void gmc_v9_0_set_xgmi_ras_funcs(struct amdgpu_device *adev) adev->gmc.xgmi.ras = &xgmi_ras; } +static void gmc_v9_0_init_nps_details(struct amdgpu_device *adev) { + adev->gmc.supported_nps_modes = 0; + + if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU)) + return; + + /*TODO: Check PSP version also which supports NPS switch. Otherwise keep +* supported modes as 0. 
+*/ + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { + case IP_VERSION(9, 4, 3): + case IP_VERSION(9, 4, 4): + adev->gmc.supported_nps_modes = + BIT(AMDGPU_NPS1_PARTITION_MODE) | + BIT(AMDGPU_NPS4_PARTITION_MODE); + break; + default: + break; + } +} + static int gmc_v9_0_early_init(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; @@ -2165,6 +2200,7 @@ static int gmc_v9_0_sw_init(void *handle) if (r) return r; + gmc_v9_0_init_nps_details(adev); /* * number of VMs * VMID 0 is reserved for System @@ -2440,6 +2476,14 @@ static int gmc_v9_0_resume(void *handle) int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + /* If a reset is done for NPS mode switch, read the memory range +* information again. +*/ + if (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS) { + gmc_v9_0_init_sw_mem_ranges(adev, adev->gmc.mem_partitions); + adev->gmc.reset_flags &= ~AMDGPU_GMC_INIT_RESET_NPS; + } + r = gmc_v9_0_hw_init(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c index d1bd79bbae53..8a0a63ac88d2 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c @@ -401,6 +401,17 @@ static int nbio_v7_9_get_compute_partition_mode(struct amdgpu_device *adev) return px; } +static bool nbio_v7_9_is_nps_switch_requested(struct amdgp
RE: [PATCH 2/7] drm/amdgpu: Add PSP interface for NPS switch
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Feifei Xu -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Tuesday, September 24, 2024 1:57 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Koenig, Christian ; Bhardwaj, Rajneesh ; Errabolu, Ramesh Subject: [PATCH 2/7] drm/amdgpu: Add PSP interface for NPS switch Implement PSP ring command interface for memory partitioning on the fly on the supported asics. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 + drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 14 +++--- 3 files changed, 37 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 944dad9ad29f..04be0fabb4f5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -1045,6 +1045,31 @@ static int psp_rl_load(struct amdgpu_device *adev) return ret; } +int psp_memory_partition(struct psp_context *psp, int mode) { + struct psp_gfx_cmd_resp *cmd; + int ret; + + if (amdgpu_sriov_vf(psp->adev)) + return 0; + + cmd = acquire_psp_cmd_buf(psp); + + cmd->cmd_id = GFX_CMD_ID_FB_NPS_MODE; + cmd->cmd.cmd_memory_part.mode = mode; + + dev_info(psp->adev->dev, +"Requesting %d memory partition change through PSP", mode); + ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr); + if (ret) + dev_err(psp->adev->dev, + "PSP request failed to change to NPS%d mode\n", mode); + + release_psp_cmd_buf(psp); + + return ret; +} + int psp_spatial_partition(struct psp_context *psp, int mode) { struct psp_gfx_cmd_resp *cmd; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h index 76fa18ffc045..567cb1f924ca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h @@ -553,6 +553,7 @@ int psp_load_fw_list(struct psp_context *psp, void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size); int psp_spatial_partition(struct psp_context *psp, int mode); +int psp_memory_partition(struct psp_context *psp, int mode); int is_psp_fw_valid(struct psp_bin_desc bin); diff --git a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h index 604301371e4f..f4a91b126c73 100644 --- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h +++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h @@ -103,8 +103,10 @@ enum psp_gfx_cmd_id GFX_CMD_ID_AUTOLOAD_RLC = 0x0021, /* Indicates all graphics fw loaded, start RLC autoload */ GFX_CMD_ID_BOOT_CFG = 0x0022, /* Boot Config */ GFX_CMD_ID_SRIOV_SPATIAL_PART = 0x0027, /* Configure spatial partitioning mode */ - /*IDs of performance monitoring/profiling*/ - GFX_CMD_ID_CONFIG_SQ_PERFMON = 0x0046, /* Config CGTT_SQ_CLK_CTRL */ +/*IDs of performance monitoring/profiling*/ +GFX_CMD_ID_CONFIG_SQ_PERFMON = 0x0046, /* Config CGTT_SQ_CLK_CTRL */ +/* Dynamic memory partitioninig (NPS mode change)*/ +GFX_CMD_ID_FB_NPS_MODE = 0x0048, /* Configure memory partitioning mode */ }; /* PSP boot config sub-commands */ @@ -362,6 +364,11 @@ struct psp_gfx_cmd_config_sq_perfmon { uint8_t reserved[5]; }; +struct psp_gfx_cmd_fb_memory_part { + uint32_t mode; /* requested NPS mode */ + uint32_t resvd; +}; + /* All GFX ring buffer commands. 
*/ union psp_gfx_commands { @@ -376,7 +383,8 @@ union psp_gfx_commands struct psp_gfx_cmd_load_toc cmd_load_toc; struct psp_gfx_cmd_boot_cfg boot_cfg; struct psp_gfx_cmd_sriov_spatial_part cmd_spatial_part; - struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon; +struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon; +struct psp_gfx_cmd_fb_memory_part cmd_memory_part; }; struct psp_gfx_uresp_reserved -- 2.25.1
RE: [PATCH 3/7] drm/amdgpu: Add gmc interface to request NPS mode
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Feifei Xu -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Tuesday, September 24, 2024 1:57 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Koenig, Christian ; Bhardwaj, Rajneesh ; Errabolu, Ramesh Subject: [PATCH 3/7] drm/amdgpu: Add gmc interface to request NPS mode Add a common interface in GMC to request NPS mode through PSP. Also add a variable in hive and gmc control to track the last requested mode. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 1 + 4 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 4f088a5368d8..758fda4e628f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1247,3 +1247,19 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, return ret; } + +int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, + int nps_mode) +{ + /* Not supported on VF devices and APUs */ + if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU)) + return -EOPNOTSUPP; + + if (!adev->psp.funcs) { + dev_err(adev->dev, + "PSP interface not available for nps mode change request"); + return -EINVAL; + } + + return psp_memory_partition(&adev->psp, nps_mode); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index 33b2adffd58b..f5be5112b742 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -161,6 +161,9 @@ struct amdgpu_gmc_funcs { enum amdgpu_memory_partition (*query_mem_partition_mode)( struct amdgpu_device *adev); + /* Request NPS mode */ + int (*request_mem_partition_mode)(struct amdgpu_device *adev, + int nps_mode); }; struct amdgpu_xgmi_ras { @@ -304,6 +307,7 @@ struct amdgpu_gmc { struct amdgpu_mem_partition_info *mem_partitions; uint8_t num_mem_partitions; const struct amdgpu_gmc_funcs *gmc_funcs; + enum amdgpu_memory_partitionrequested_nps_mode; struct amdgpu_xgmi xgmi; struct amdgpu_irq_src ecc_irq; @@ -455,4 +459,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, struct amdgpu_mem_partition_info *mem_ranges, int exp_ranges); +int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, + int nps_mode); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index b17e63c98a99..5d721ccb9dfd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -667,6 +667,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev) task_barrier_init(&hive->tb); hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN; hive->hi_req_gpu = NULL; + atomic_set(&hive->requested_nps_mode, UNKNOWN_MEMORY_PARTITION_MODE); /* * hive pstate on boot is high in vega20 so we have to go to low diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h index d652727ca565..67abadb4f298 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h @@ -46,6 +46,7 @@ struct amdgpu_hive_info { atomic_t ras_recovery; struct ras_event_manager event_mgr; struct work_struct reset_on_init_work; + atomic_t requested_nps_mode; }; struct amdgpu_pcs_ras_field { -- 2.25.1
Re: [PATCH v2] drm/sched: Further optimise drm_sched_entity_push_job
Am 26.09.24 um 10:18 schrieb Philipp Stanner: On Tue, 2024-09-24 at 14:02 +0200, Christian König wrote: Am 24.09.24 um 11:58 schrieb Tvrtko Ursulin: On 24/09/2024 10:45, Tvrtko Ursulin wrote: On 24/09/2024 09:20, Christian König wrote: Am 16.09.24 um 19:30 schrieb Tvrtko Ursulin: From: Tvrtko Ursulin Having removed one re-lock cycle on the entity->lock in a patch titled "drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit larger refactoring we can do the same optimisation on the rq- lock. (Currently both drm_sched_rq_add_entity() and drm_sched_rq_update_fifo_locked() take and release the same lock.) To achieve this we make drm_sched_rq_update_fifo_locked() and drm_sched_rq_add_entity() expect the rq->lock to be held. We also align drm_sched_rq_update_fifo_locked(), drm_sched_rq_add_entity() and drm_sched_rq_remove_fifo_locked() function signatures, by adding rq as a parameter to the latter. v2: * Fix after rebase of the series. * Avoid naming incosistency between drm_sched_rq_add/remove. (Christian) Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Alex Deucher Cc: Luben Tuikov Cc: Matthew Brost Cc: Philipp Stanner Reviewed-by: Christian König Thanks! Are you okay to pull into drm-misc-next or we should do some more testing on this? And/or should I resend the series once more in it's entirety so this v2 is not a reply-to to the original? I have to respin for the drm_sched_wakeup fix that landed. When I should push the series to drm-misc-next then please send it to me once more. On the other hand we should now have to maintainers for that. Yup, will pick up this responsibilities soonish. Danilo and I have been on conference recently and I'm out of office soon for a bit, but you can expect me / us to take over that work soonish in early autumn. But please don't push this set. I have messed up the push because I didn't realized that the first three patches were supposed to land in -fixes instead of -next. Should be cleaned up my me by the end of the day. Sorry for the noise, Christian. Regards, P. Regards, Christian. 
Regards, Tvrtko Regards, Tvrtko --- drivers/gpu/drm/scheduler/sched_entity.c | 12 -- drivers/gpu/drm/scheduler/sched_main.c | 29 include/drm/gpu_scheduler.h | 3 ++- 3 files changed, 26 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index d982cebc6bee..8ace1f1ea66b 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -515,9 +515,14 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) next = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); if (next) { + struct drm_sched_rq *rq; + spin_lock(&entity->lock); - drm_sched_rq_update_fifo_locked(entity, + rq = entity->rq; + spin_lock(&rq->lock); + drm_sched_rq_update_fifo_locked(entity, rq, next->submit_ts); + spin_unlock(&rq->lock); spin_unlock(&entity->lock); } } @@ -618,11 +623,14 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job) sched = rq->sched; atomic_inc(sched->score); + + spin_lock(&rq->lock); drm_sched_rq_add_entity(rq, entity); if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) - drm_sched_rq_update_fifo_locked(entity, submit_ts); + drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); + spin_unlock(&rq->lock); spin_unlock(&entity->lock); drm_sched_wakeup(sched, entity); diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 18a952f73ecb..5c83fb92bb89 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -153,17 +153,18 @@ static __always_inline bool drm_sched_entity_compare_before(struct rb_node *a, return ktime_before(ent_a->oldest_job_waiting, ent_b->oldest_job_waiting); } -static inline void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity) +static void drm_sched_rq_remove_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq) { - struct drm_sched_rq *rq = entity->rq; - if (!RB_EMPTY_NODE(&entity->rb_tree_node)) { rb_erase_cached(&entity->rb_tree_node, &rq- rb_tree_root); RB_CLEAR_NODE(&entity->rb_tree_node); } } -void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, ktime_t ts) +void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, + struct drm_sched_rq *rq, + ktime_t ts) { /* * Both locks need to be grabbed, one to protect from entity->rq change @@ -171,17 +172,14 @@ void
Re: [PATCH v1 4/9] drm/amd/pm: don't update runpm last_usage on debugfs getter
Le 26/09/2024 à 10:55, Lazar, Lijo a écrit : On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: Reading pm values from the GPU shouldn't prevent it to be suspended by resetting the last active timestamp (eg: if an background app monitors GPU sensors every second, it would prevent the autosuspend sequence to trigger). Tested-by: Mario Limonciello Signed-off-by: Pierre-Eric Pelloux-Prayer --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 26 -- 1 file changed, 26 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 042a4dd1bd6a..c8f34b1a4d8e 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -151,7 +151,6 @@ static ssize_t amdgpu_get_power_dpm_state(struct device *dev, amdgpu_dpm_get_current_power_state(adev, &pm); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return sysfs_emit(buf, "%s\n", @@ -275,7 +274,6 @@ static ssize_t amdgpu_get_power_dpm_force_performance_level(struct device *dev, level = amdgpu_dpm_get_performance_level(adev); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return sysfs_emit(buf, "%s\n", @@ -373,7 +371,6 @@ static ssize_t amdgpu_get_pp_num_states(struct device *dev, if (amdgpu_dpm_get_pp_num_states(adev, &data)) memset(&data, 0, sizeof(data)); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); buf_len = sysfs_emit(buf, "states: %d\n", data.nums); @@ -410,7 +407,6 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev, ret = amdgpu_dpm_get_pp_num_states(adev, &data); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); if (ret) @@ -536,7 +532,6 @@ static ssize_t amdgpu_get_pp_table(struct device *dev, size = amdgpu_dpm_get_pp_table(adev, &table); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); if (size <= 0) @@ -866,7 +861,6 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev, if (size == 0) size = sysfs_emit(buf, "\n"); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return size; @@ -944,7 +938,6 @@ static ssize_t amdgpu_get_pp_features(struct device *dev, if (size <= 0) size = sysfs_emit(buf, "\n"); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return size; @@ -1014,7 +1007,6 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev, if (size == 0) size = sysfs_emit(buf, "\n"); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return size; @@ -1259,7 +1251,6 @@ static ssize_t amdgpu_get_pp_sclk_od(struct device *dev, value = amdgpu_dpm_get_sclk_od(adev); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return sysfs_emit(buf, "%d\n", value); @@ -1317,7 +1308,6 @@ static ssize_t amdgpu_get_pp_mclk_od(struct device *dev, value = amdgpu_dpm_get_mclk_od(adev); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return sysfs_emit(buf, "%d\n", value); @@ -1397,7 +1387,6 @@ static ssize_t amdgpu_get_pp_power_profile_mode(struct device *dev, if (size <= 0) size = sysfs_emit(buf, "\n"); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return size; @@ -1485,7 +1474,6 @@ static int amdgpu_hwmon_get_sensor_generic(struct amdgpu_device *adev, /* get the sensor value */ r = amdgpu_dpm_read_sensor(adev, sensor, query, &size); - pm_runtime_mark_last_busy(adev_to_drm(adev)->dev); pm_runtime_put_autosuspend(adev_to_drm(adev)->dev); return r; @@ -1601,7 
+1589,6 @@ static ssize_t amdgpu_get_pcie_bw(struct device *dev, amdgpu_asic_get_pcie_usage(adev, &count0, &count1); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return sysfs_emit(buf, "%llu %llu %i\n", @@ -1734,7 +1721,6 @@ static ssize_t amdgpu_get_apu_thermal_cap(struct device *dev, else size = sysfs_emit(buf, "failed to get thermal limit\n"); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return size; @@ -1807,7 +1793,6 @@ static ssize_t amdgpu_get_pm_metrics(struct device *dev, size = amdgpu_dpm_get_pm_metrics(adev, buf, PAGE_SIZE); - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_autosuspend(ddev->dev); return size; @@ -1854,7 +1839,6 @@ static ssize_t amdgpu_get_gpu_metrics(struct device *dev, memcpy(buf, gpu_metrics, size); out: - pm_runtime_mark_last_busy(ddev->dev); pm_runtime_put_aut
Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters
On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: > Don't wake up the GPU for reading pm values. Instead, take a runtime > powermanagement ref when trying to read it iff the GPU is already > awake. > > This avoids spurious wake ups (eg: from applets). > > We use pm_runtime_get_if_in_active(ign_usage_count=true) because we care > about "is the GPU awake?" not about "is the GPU awake and something else > prevents suspend?". One possible downside of this approach is - let's say an application tries this way on a BACO enabled device (device is visible on bus, but powered off due to runtime PM) Get current clock level If (success) && (not desired clock level) Set clock level Submit some jobs to run at set clock level This type of approach won't work since get clock level() itself will fail. That said, I don't know if there is any app out there does something like this. Thanks, Lijo > > Tested-by: Mario Limonciello > Signed-off-by: Pierre-Eric Pelloux-Prayer > --- > drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++--- > 1 file changed, 69 insertions(+), 69 deletions(-) > > diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > b/drivers/gpu/drm/amd/pm/amdgpu_pm.c > index c8f34b1a4d8e..f1f339b75380 100644 > --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c > +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c > @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct device > *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: -EPERM; > > amdgpu_dpm_get_current_power_state(adev, &pm); > > @@ -268,9 +268,9 @@ static ssize_t > amdgpu_get_power_dpm_force_performance_level(struct device *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: -EPERM; > > level = amdgpu_dpm_get_performance_level(adev); > > @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct device > *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: -EPERM; > > if (amdgpu_dpm_get_pp_num_states(adev, &data)) > memset(&data, 0, sizeof(data)); > @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: -EPERM; > > amdgpu_dpm_get_current_power_state(adev, &pm); > > @@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct device *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: -EPERM; > > size = amdgpu_dpm_get_pp_table(adev, &table); > > @@ -840,9 +840,9 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device > *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: 
-EPERM; > > for (clk_index = 0 ; clk_index < 6 ; clk_index++) { > ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], > buf, &size); > @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct device *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtime_get_if_active(ddev->dev, true); > + if (ret <= 0) > + return ret ?: -EPERM; > > size = amdgpu_dpm_get_ppfeature_status(adev, buf); > if (size <= 0) > @@ -996,9 +996,9 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev, > if (adev->in_suspend && !adev->in_runpm) > return -EPERM; > > - ret = pm_runtime_resume_and_get(ddev->dev); > - if (ret < 0) > - return ret; > + ret = pm_runtim
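For readers skimming the thread, patches 4/9 and 5/9 together give the getters the following shape. This is a minimal sketch assuming the two-argument pm_runtime_get_if_active() form used in the series; foo_show() and foo_read_hw() are placeholder names, not functions from amdgpu_pm.c.

	#include <linux/device.h>
	#include <linux/pm_runtime.h>
	#include <linux/sysfs.h>

	/* Placeholder for whatever register/sensor read the real attribute does. */
	static int foo_read_hw(struct device *dev);

	static ssize_t foo_show(struct device *dev, struct device_attribute *attr,
				char *buf)
	{
		int ret, val;

		/* Take a runtime-PM reference only if the device is already
		 * awake; 0 means "suspended", so the read is refused instead
		 * of waking the GPU. */
		ret = pm_runtime_get_if_active(dev, true);
		if (ret <= 0)
			return ret ?: -EPERM;

		val = foo_read_hw(dev);

		/* Drop the reference without touching last_busy (patch 4/9),
		 * so a background poller does not keep pushing out the
		 * autosuspend timeout. */
		pm_runtime_put_autosuspend(dev);

		return sysfs_emit(buf, "%d\n", val);
	}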
Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL
Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: [SNIP] +int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_userq_fence_info *fence_info = NULL; + struct drm_amdgpu_userq_wait *wait_info = data; + u32 *syncobj_handles, *bo_handles; + struct dma_fence **fences = NULL; + u32 num_syncobj, num_bo_handles; + struct drm_gem_object **gobj; + struct drm_exec exec; + int r, i, entry, cnt; + u64 num_fences = 0; + + num_bo_handles = wait_info->num_bo_handles; + bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array), +sizeof(u32) * num_bo_handles); + if (IS_ERR(bo_handles)) + return PTR_ERR(bo_handles); + + num_syncobj = wait_info->num_syncobj_handles; + syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array), + sizeof(u32) * num_syncobj); + if (IS_ERR(syncobj_handles)) { + r = PTR_ERR(syncobj_handles); + goto free_bo_handles; + } + + /* Array of GEM object handles */ + gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL); + if (!gobj) { + r = -ENOMEM; + goto free_syncobj_handles; + } + + for (entry = 0; entry < num_bo_handles; entry++) { + gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]); + if (!gobj[entry]) { + r = -ENOENT; + goto put_gobj; + } + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); + drm_exec_until_all_locked(&exec) { + r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0); + drm_exec_retry_on_contention(&exec); + if (r) { + drm_exec_fini(&exec); + goto put_gobj; + } + } + + if (!wait_info->num_fences) { + /* Count syncobj's fence */ + for (i = 0; i < num_syncobj; i++) { + struct dma_fence *fence; + + r = drm_syncobj_find_fence(filp, syncobj_handles[i], + 0, 0, &fence); + dma_fence_put(fence); + + if (r || !fence) + continue; + + num_fences++; + } + + /* Count GEM objects fence */ + for (i = 0; i < num_bo_handles; i++) { + struct dma_resv_iter resv_cursor; + struct dma_fence *fence; + + dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv, + dma_resv_usage_rw(wait_info->bo_wait_flags & + AMDGPU_USERQ_BO_WRITE), fence) + num_fences++; We should probably adjust the UAPI here once more. The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the whole IOCTL instead of per BO. So the best approach would probably be to drop the AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers and writers. Can you work on that Arun? Shouldn't be more than a bit typing exercise. Thanks, Christian.
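To make the suggestion concrete: with separate reader and writer arrays, the fence-counting loop quoted above no longer needs the single AMDGPU_USERQ_BO_WRITE flag and can pick the dma_resv usage per array. A rough, untested fragment with made-up variable names (gobj_read/gobj_write, num_read_bo_handles/num_write_bo_handles), just to illustrate the direction:

	/* Readers only have to wait for writers ... */
	for (i = 0; i < num_read_bo_handles; i++) {
		dma_resv_for_each_fence(&resv_cursor, gobj_read[i]->resv,
					DMA_RESV_USAGE_WRITE, fence)
			num_fences++;
	}

	/* ... while writers have to wait for both readers and writers. */
	for (i = 0; i < num_write_bo_handles; i++) {
		dma_resv_for_each_fence(&resv_cursor, gobj_write[i]->resv,
					DMA_RESV_USAGE_READ, fence)
			num_fences++;
	}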
Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL
Hi Christian, On 9/26/2024 2:57 PM, Christian König wrote: Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: [SNIP] +int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_userq_fence_info *fence_info = NULL; + struct drm_amdgpu_userq_wait *wait_info = data; + u32 *syncobj_handles, *bo_handles; + struct dma_fence **fences = NULL; + u32 num_syncobj, num_bo_handles; + struct drm_gem_object **gobj; + struct drm_exec exec; + int r, i, entry, cnt; + u64 num_fences = 0; + + num_bo_handles = wait_info->num_bo_handles; + bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array), + sizeof(u32) * num_bo_handles); + if (IS_ERR(bo_handles)) + return PTR_ERR(bo_handles); + + num_syncobj = wait_info->num_syncobj_handles; + syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array), + sizeof(u32) * num_syncobj); + if (IS_ERR(syncobj_handles)) { + r = PTR_ERR(syncobj_handles); + goto free_bo_handles; + } + + /* Array of GEM object handles */ + gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL); + if (!gobj) { + r = -ENOMEM; + goto free_syncobj_handles; + } + + for (entry = 0; entry < num_bo_handles; entry++) { + gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]); + if (!gobj[entry]) { + r = -ENOENT; + goto put_gobj; + } + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); + drm_exec_until_all_locked(&exec) { + r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0); + drm_exec_retry_on_contention(&exec); + if (r) { + drm_exec_fini(&exec); + goto put_gobj; + } + } + + if (!wait_info->num_fences) { + /* Count syncobj's fence */ + for (i = 0; i < num_syncobj; i++) { + struct dma_fence *fence; + + r = drm_syncobj_find_fence(filp, syncobj_handles[i], + 0, 0, &fence); + dma_fence_put(fence); + + if (r || !fence) + continue; + + num_fences++; + } + + /* Count GEM objects fence */ + for (i = 0; i < num_bo_handles; i++) { + struct dma_resv_iter resv_cursor; + struct dma_fence *fence; + + dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv, + dma_resv_usage_rw(wait_info->bo_wait_flags & + AMDGPU_USERQ_BO_WRITE), fence) + num_fences++; We should probably adjust the UAPI here once more. The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the whole IOCTL instead of per BO. So the best approach would probably be to drop the AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers and writers. Can you work on that Arun? Shouldn't be more than a bit typing exercise. Sure, I will modify and send the next version of this file. Thanks, Arun. Thanks, Christian.
Re: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init
On 9/26/2024 2:31 PM, Xu, Feifei wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > >>> + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && >>> + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); > > Is there a corner case that reloading with a different version tos and > refreshing nps change co-exist? > I guess you are referring to the below corner case 1) Place NPS request 2) Unload Driver 3) Reinstall driver with a different TOS (possible but quite unlikely) 4) Driver reload 5) Driver checks TOS version first and goes for a reset 6) reset_flag of GMC is not set, hence it doesn't refresh the NPS range. I think changing the order in soc15_need_reset_on_init() to check for NPS request before TOS version check will solve this. Thanks, Lijo > Thanks, > Feifei > > -Original Message- > From: amd-gfx On Behalf Of Lijo Lazar > Sent: Tuesday, September 24, 2024 1:57 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Deucher, Alexander > ; Koenig, Christian ; > Bhardwaj, Rajneesh ; Errabolu, Ramesh > > Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init > > Add a callback to check if there is any condition detected by GMC block for > reset on init. One case is if a pending NPS change request is detected. If > reset is done because of NPS switch, refresh NPS info from discovery table. > > Signed-off-by: Lijo Lazar > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 - > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 + > drivers/gpu/drm/amd/amdgpu/soc15.c | 2 ++ > 3 files changed, 19 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > index 21f1e65c9dc9..011fe3a847d0 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c > @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device > *adev, > struct amdgpu_gmc_memrange *ranges; > int range_cnt, ret, i, j; > uint32_t nps_type; > + bool refresh; > > if (!mem_ranges) > return -EINVAL; > > + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && > + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); > ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges, > - &range_cnt, false); > + &range_cnt, refresh); > > if (ret) > return ret; > @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct > amdgpu_device *adev) > adev->dev, > "NPS mode change request done, reload driver to > complete the change\n"); } > + > +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) { > + if (adev->gmc.gmc_funcs->need_reset_on_init) > + return adev->gmc.gmc_funcs->need_reset_on_init(adev); > + > + return false; > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h > index b13d6adb5efd..d4cd247fe574 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h > @@ -78,6 +78,8 @@ enum amdgpu_memory_partition { > BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \ > BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE)) > > +#define AMDGPU_GMC_INIT_RESET_NPS BIT(0) > + > /* > * GMC page fault information > */ > @@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs { > /* Request NPS mode */ > int (*request_mem_partition_mode)(struct amdgpu_device *adev, > int nps_mode); > + bool (*need_reset_on_init)(struct amdgpu_device *adev); > }; > > struct amdgpu_xgmi_ras { > @@ -314,6 +317,7 @@ struct amdgpu_gmc { > 
const struct amdgpu_gmc_funcs *gmc_funcs; > enum amdgpu_memory_partitionrequested_nps_mode; > uint32_t supported_nps_modes; > + uint32_t reset_flags; > > struct amdgpu_xgmi xgmi; > struct amdgpu_irq_src ecc_irq; > @@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device > *adev, int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, > int nps_mode); > void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev); > +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev); > > #endif > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c > b/drivers/gpu/drm/amd/amdgpu/soc15.c > index 619933f252aa..97ca4931a7ef 100644 > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c > @@ -833,6 +833,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device > *adev) > > if
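In code, the reordering Lijo proposes for soc15_need_reset_on_init() would look roughly like the sketch below. soc15_need_reset_for_tos() is a hypothetical stand-in for the existing TOS-version comparison; only the ordering (GMC/NPS check first) is the point.

	static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
	{
		/* Evaluate the GMC condition first so a pending NPS request
		 * is noticed (and its reset flag recorded) even when another
		 * reason, such as a TOS version mismatch, would also trigger
		 * the reset. */
		if (amdgpu_gmc_need_reset_on_init(adev))
			return true;

		/* Hypothetical stand-in for the existing TOS version check
		 * and the other conditions already in this function. */
		if (soc15_need_reset_for_tos(adev))
			return true;

		return false;
	}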
Re: [PATCH v1 6/9] drm/amd/pm: stop extra checks for runtime pm state
On 9/25/2024 7:59 PM, Pierre-Eric Pelloux-Prayer wrote: > > > Le 25/09/2024 à 15:37, Lazar, Lijo a écrit : >> >> >> On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: >>> pm_runtime_get_if_in_use already checks if the GPU is active, >>> so there's no need for manually checking runtimepm status: >>> >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> >>> Tested-by: Mario Limonciello >>> Signed-off-by: Pierre-Eric Pelloux-Prayer >>> >>> --- >>> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 46 -- >>> 1 file changed, 46 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> index f1f339b75380..13be5e017a01 100644 >>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> @@ -142,8 +142,6 @@ static ssize_t amdgpu_get_power_dpm_state(struct >>> device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> >> >> I believe this check is for accesses before the device is fully resumed >> from a suspend sequence. That is not tied to runtime PM. > > AFAICT in_suspend / in_runpm are only set from resume / suspend > sequences, so checking runtime_status != RPM_ACTIVE like > pm_runtime_get_if_in_use does should provide the same result. > > (= during resume the device status is RPM_RESUMING) > > Pierre-Eric > On devices whose runtime PM is forbidden, I'm not sure if it goes through RPM_ state changes or just statically remains at RPM_ACTIVE. Regardless, these checks are removed only for sys/debug fs attributes. Hence as Alex mentioned this access check was not required in the first place. Thanks, Lijo > >> >> Thanks, >> Lijo >> >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -265,8 +263,6 @@ static ssize_t >>> amdgpu_get_power_dpm_force_performance_level(struct device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -361,8 +357,6 @@ static ssize_t amdgpu_get_pp_num_states(struct >>> device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -396,8 +390,6 @@ static ssize_t amdgpu_get_pp_cur_state(struct >>> device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -523,8 +515,6 @@ static ssize_t amdgpu_get_pp_table(struct device >>> *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -837,8 +827,6 @@ static ssize_t >>> amdgpu_get_pp_od_clk_voltage(struct device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -927,8 +915,6 @@ static ssize_t amdgpu_get_pp_features(struct >>> device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -993,8 +979,6 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct >>> device *dev, 
>>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -1242,8 +1226,6 @@ static ssize_t amdgpu_get_pp_sclk_od(struct >>> device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -1299,8 +1281,6 @@ static ssize_t amdgpu_get_pp_mclk_od(struct >>> device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> - return -EPERM; >>> ret = pm_runtime_get_if_active(ddev->dev, true); >>> if (ret <= 0) >>> @@ -1376,8 +1356,6 @@ static ssize_t >>> amdgpu_get_pp_power_profile_mode(struct device *dev, >>> if (amdgpu_in_reset(adev)) >>> return -EPERM; >>> - if (adev->in_suspend && !adev->in_runpm) >>> -
Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL
Hi Christian, On 9/26/2024 3:04 PM, Christian König wrote: Am 26.09.24 um 11:31 schrieb Paneer Selvam, Arunpravin: Hi Christian, On 9/26/2024 2:57 PM, Christian König wrote: Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: [SNIP] +int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_userq_fence_info *fence_info = NULL; + struct drm_amdgpu_userq_wait *wait_info = data; + u32 *syncobj_handles, *bo_handles; + struct dma_fence **fences = NULL; + u32 num_syncobj, num_bo_handles; + struct drm_gem_object **gobj; + struct drm_exec exec; + int r, i, entry, cnt; + u64 num_fences = 0; + + num_bo_handles = wait_info->num_bo_handles; + bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array), + sizeof(u32) * num_bo_handles); + if (IS_ERR(bo_handles)) + return PTR_ERR(bo_handles); + + num_syncobj = wait_info->num_syncobj_handles; + syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array), + sizeof(u32) * num_syncobj); + if (IS_ERR(syncobj_handles)) { + r = PTR_ERR(syncobj_handles); + goto free_bo_handles; + } + + /* Array of GEM object handles */ + gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL); + if (!gobj) { + r = -ENOMEM; + goto free_syncobj_handles; + } + + for (entry = 0; entry < num_bo_handles; entry++) { + gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]); + if (!gobj[entry]) { + r = -ENOENT; + goto put_gobj; + } + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); + drm_exec_until_all_locked(&exec) { + r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0); + drm_exec_retry_on_contention(&exec); + if (r) { + drm_exec_fini(&exec); + goto put_gobj; + } + } + + if (!wait_info->num_fences) { + /* Count syncobj's fence */ + for (i = 0; i < num_syncobj; i++) { + struct dma_fence *fence; + + r = drm_syncobj_find_fence(filp, syncobj_handles[i], + 0, 0, &fence); + dma_fence_put(fence); + + if (r || !fence) + continue; + + num_fences++; + } + + /* Count GEM objects fence */ + for (i = 0; i < num_bo_handles; i++) { + struct dma_resv_iter resv_cursor; + struct dma_fence *fence; + + dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv, + dma_resv_usage_rw(wait_info->bo_wait_flags & + AMDGPU_USERQ_BO_WRITE), fence) + num_fences++; We should probably adjust the UAPI here once more. The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the whole IOCTL instead of per BO. So the best approach would probably be to drop the AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers and writers. Can you work on that Arun? Shouldn't be more than a bit typing exercise. Sure, I will modify and send the next version of this file. Thanks. In the meantime I'm going to review the rest of the series, so there could be more comments. But please update the UAPI first. Can we have the bo_handles_read_array, num_read_bo_handles, bo_handles_write_array, num_write_bo_handles in both signal IOCTL and wait IOCTL? Thanks, Arun. Regards, Christian. Thanks, Arun. Thanks, Christian.
Re: [PATCH v7 09/10] drm/amd/display: remove dc_edid handler from dm_helpers_parse_edid_caps
On 25/09/2024 14:55, Mario Limonciello wrote: Alex, Unfortunately I can't reproduce the regression on the APU I tried. However I do have a suspicion on a fix. Can you see if this helps? If it does, we can squash it in. diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index bf847ac55475..e75cd527d753 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -7008,6 +7008,7 @@ static void amdgpu_dm_connector_destroy(struct drm_connector *connector) kfree(aconnector->i2c); } kfree(aconnector->dm_dp_aux.aux.name); + drm_edid_free(aconnector->drm_edid); kfree(connector); } If that doesn't help, I did test dropping this patch and it doesn't affect the last patch in the series, that one still works so I'm fine with dropping it and we can follow up later. yes, I agree with Mario's proposal. On 9/25/2024 12:06, Alex Hung wrote: Mario and Melissa, This patch causes a regrerssion on 7900 XTX in an IGT test: amd_mem_leak's connector-suspend-resume. Is this patch necessary on this series or is it independent from other patches, i.e. can it be dropped from this series until fixed?? Cheers, Alex Hung On 9/18/24 15:38, Mario Limonciello wrote: From: Melissa Wen We can parse edid caps from drm_edid and drm_eld and any calls of dm_helpers_parse_edid_caps is made in a state that we have drm_edid set to amdgpu connector. Signed-off-by: Melissa Wen Co-developed-by: Mario Limonciello Signed-off-by: Mario Limonciello --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +--- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 30 --- drivers/gpu/drm/amd/display/dc/dm_helpers.h | 1 - .../drm/amd/display/dc/link/link_detection.c | 6 ++-- 4 files changed, 16 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index bd8fb9968c7c..bf847ac55475 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -7123,10 +7123,7 @@ static void amdgpu_dm_connector_funcs_force(struct drm_connector *connector) memset(&dc_em_sink->edid_caps, 0, sizeof(struct dc_edid_caps)); memmove(dc_em_sink->dc_edid.raw_edid, edid, (edid->extensions + 1) * EDID_LENGTH); - dm_helpers_parse_edid_caps( - dc_link, - &dc_em_sink->dc_edid, - &dc_em_sink->edid_caps); + dm_helpers_parse_edid_caps(dc_link, &dc_em_sink->edid_caps); } } diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index d43ed3ea000b..8f4be7a1ec45 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -83,32 +83,24 @@ static void apply_edid_quirks(struct drm_edid_product_id *product_id, * dm_helpers_parse_edid_caps() - Parse edid caps * * @link: current detected link - * @edid: [in] pointer to edid * @edid_caps: [in] pointer to edid caps * * Return: void */ -enum dc_edid_status dm_helpers_parse_edid_caps( - struct dc_link *link, - const struct dc_edid *edid, - struct dc_edid_caps *edid_caps) +enum dc_edid_status dm_helpers_parse_edid_caps(struct dc_link *link, + struct dc_edid_caps *edid_caps) { struct amdgpu_dm_connector *aconnector = link->priv; struct drm_connector *connector = &aconnector->base; const struct drm_edid *drm_edid = aconnector->drm_edid; struct drm_edid_product_id product_id; - struct edid *edid_buf = edid ? 
(struct edid *) edid->raw_edid : NULL; int sad_count; int i = 0; - enum dc_edid_status result = EDID_OK; - if (!edid_caps || !edid) + if (!edid_caps || !drm_edid) return EDID_BAD_INPUT; - if (!drm_edid_is_valid(edid_buf)) - result = EDID_BAD_CHECKSUM; - drm_edid_get_product_id(drm_edid, &product_id); edid_caps->manufacturer_id = le16_to_cpu(product_id.manufacturer_name); @@ -920,19 +912,23 @@ enum dc_edid_status dm_helpers_read_local_edid( if (!drm_edid) return EDID_NO_RESPONSE; - edid = drm_edid_raw(drm_edid); // FIXME: Get rid of drm_edid_raw() + /* FIXME: Get rid of drm_edid_raw() + * Raw edid is still needed for dm_helpers_dp_write_dpcd() + */ + edid = drm_edid_raw(drm_edid); sink->dc_edid.length = EDID_LENGTH * (edid->extensions + 1); memmove(sink->dc_edid.raw_edid, (uint8_t *)edid, sink- >dc_edid.length); edid_status = dm_helpers_parse_edid_caps( link, - &sink->dc_edid, &sink->edid_caps); -
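The hunk Mario proposes is really about drm_edid ownership: whoever caches a struct drm_edid pointer on the connector owns it and must drm_edid_free() it, both when the cached copy is replaced and when the connector is destroyed. A generic sketch of that rule, with a made-up driver structure and helper names; only drm_edid_read(), drm_edid_free() and drm_connector_cleanup() are real DRM APIs:

	#include <linux/slab.h>
	#include <drm/drm_connector.h>
	#include <drm/drm_edid.h>

	struct my_connector {
		struct drm_connector base;
		const struct drm_edid *cached_edid;
	};

	#define to_my_connector(c) container_of(c, struct my_connector, base)

	static void my_connector_update_edid(struct my_connector *conn)
	{
		const struct drm_edid *new_edid;

		new_edid = drm_edid_read(&conn->base);	/* may return NULL */

		drm_edid_free(conn->cached_edid);	/* drop the previous copy */
		conn->cached_edid = new_edid;
	}

	static void my_connector_destroy(struct drm_connector *connector)
	{
		struct my_connector *conn = to_my_connector(connector);

		drm_edid_free(conn->cached_edid);	/* the leak Mario's hunk plugs */
		drm_connector_cleanup(connector);
		kfree(conn);
	}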
Re: [PATCH] drm/amdgpu: fix vbios fetching for SR-IOV
On 9/25/2024 11:50 PM, Alex Deucher wrote: > SR-IOV fetches the vbios from VRAM in some cases. > Re-enable the VRAM path for dGPUs and rename the function > to make it clear that it is not IGP specific. > > Fixes: 042658d17a54 ("drm/amdgpu: clean up vbios fetching code") > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 11 +-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > index 46bf623919d7c..45affc02548c1 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > @@ -87,8 +87,9 @@ static bool check_atom_bios(uint8_t *bios, size_t size) > * part of the system bios. On boot, the system bios puts a > * copy of the igp rom at the start of vram if a discrete card is > * present. > + * For SR-IOV, the vbios image is also put in VRAM in the VF. > */ > -static bool igp_read_bios_from_vram(struct amdgpu_device *adev) > +static bool amdgpu_read_bios_from_vram(struct amdgpu_device *adev) > { > uint8_t __iomem *bios; > resource_size_t vram_base; > @@ -414,7 +415,7 @@ static bool amdgpu_get_bios_apu(struct amdgpu_device > *adev) > goto success; > } > > - if (igp_read_bios_from_vram(adev)) { > + if (amdgpu_read_bios_from_vram(adev)) { > dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n"); > goto success; > } > @@ -448,6 +449,12 @@ static bool amdgpu_get_bios_dgpu(struct amdgpu_device > *adev) > goto success; > } > > + /* this is required for SR-IOV */ > + if (amdgpu_read_bios_from_vram(adev)) { If this is only expected for VFs, then it's better to add amdgpu_sriov_vf(adev) check for this call. Thanks, Lijo > + dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n"); > + goto success; > + } > + > if (amdgpu_read_platform_bios(adev)) { > dev_info(adev->dev, "Fetched VBIOS from platform\n"); > goto success;
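If Lijo's suggestion is taken, the follow-up change would be a one-line tweak to the hunk above; sketched here only to show the shape, not as a tested patch:

	--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
	+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
	@@ static bool amdgpu_get_bios_dgpu(struct amdgpu_device *adev)
	 	/* this is required for SR-IOV */
	-	if (amdgpu_read_bios_from_vram(adev)) {
	+	if (amdgpu_sriov_vf(adev) && amdgpu_read_bios_from_vram(adev)) {
	 		dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n");
	 		goto success;
	 	}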
Re: [PATCH v2 01/08] drm/amdgpu: Implement userqueue signal/wait IOCTL
Am 26.09.24 um 12:26 schrieb Paneer Selvam, Arunpravin: Hi Christian, On 9/26/2024 3:04 PM, Christian König wrote: Am 26.09.24 um 11:31 schrieb Paneer Selvam, Arunpravin: Hi Christian, On 9/26/2024 2:57 PM, Christian König wrote: Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: [SNIP] +int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_userq_fence_info *fence_info = NULL; + struct drm_amdgpu_userq_wait *wait_info = data; + u32 *syncobj_handles, *bo_handles; + struct dma_fence **fences = NULL; + u32 num_syncobj, num_bo_handles; + struct drm_gem_object **gobj; + struct drm_exec exec; + int r, i, entry, cnt; + u64 num_fences = 0; + + num_bo_handles = wait_info->num_bo_handles; + bo_handles = memdup_user(u64_to_user_ptr(wait_info->bo_handles_array), + sizeof(u32) * num_bo_handles); + if (IS_ERR(bo_handles)) + return PTR_ERR(bo_handles); + + num_syncobj = wait_info->num_syncobj_handles; + syncobj_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_handles_array), + sizeof(u32) * num_syncobj); + if (IS_ERR(syncobj_handles)) { + r = PTR_ERR(syncobj_handles); + goto free_bo_handles; + } + + /* Array of GEM object handles */ + gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL); + if (!gobj) { + r = -ENOMEM; + goto free_syncobj_handles; + } + + for (entry = 0; entry < num_bo_handles; entry++) { + gobj[entry] = drm_gem_object_lookup(filp, bo_handles[entry]); + if (!gobj[entry]) { + r = -ENOENT; + goto put_gobj; + } + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); + drm_exec_until_all_locked(&exec) { + r = drm_exec_prepare_array(&exec, gobj, num_bo_handles, 0); + drm_exec_retry_on_contention(&exec); + if (r) { + drm_exec_fini(&exec); + goto put_gobj; + } + } + + if (!wait_info->num_fences) { + /* Count syncobj's fence */ + for (i = 0; i < num_syncobj; i++) { + struct dma_fence *fence; + + r = drm_syncobj_find_fence(filp, syncobj_handles[i], + 0, 0, &fence); + dma_fence_put(fence); + + if (r || !fence) + continue; + + num_fences++; + } + + /* Count GEM objects fence */ + for (i = 0; i < num_bo_handles; i++) { + struct dma_resv_iter resv_cursor; + struct dma_fence *fence; + + dma_resv_for_each_fence(&resv_cursor, gobj[i]->resv, + dma_resv_usage_rw(wait_info->bo_wait_flags & + AMDGPU_USERQ_BO_WRITE), fence) + num_fences++; We should probably adjust the UAPI here once more. The problem is that we only provide the AMDGPU_USERQ_BO_WRITE for the whole IOCTL instead of per BO. So the best approach would probably be to drop the AMDGPU_USERQ_BO_WRITE flag and split up the array of BOs into readers and writers. Can you work on that Arun? Shouldn't be more than a bit typing exercise. Sure, I will modify and send the next version of this file. Thanks. In the meantime I'm going to review the rest of the series, so there could be more comments. But please update the UAPI first. Can we have the bo_handles_read_array, num_read_bo_handles, bo_handles_write_array, num_write_bo_handles in both signal IOCTL and wait IOCTL? I think so, yes. Regards, Christian. Thanks, Arun. Regards, Christian. Thanks, Arun. Thanks, Christian.
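For anyone following along, the agreed direction is to drop AMDGPU_USERQ_BO_WRITE and pass separate read/write GEM handle arrays in both the signal and the wait IOCTLs. A very rough guess at what the wait struct could end up looking like; every field name here is speculative until Arun posts the next revision:

	struct drm_amdgpu_userq_wait {
		/* syncobj handles to collect fences from */
		__u64	syncobj_handles_array;
		__u32	num_syncobj_handles;
		__u32	pad;
		/* GEM handles only read by the queue: wait on their writers */
		__u64	bo_read_handles_array;
		__u32	num_bo_read_handles;
		__u32	pad2;
		/* GEM handles written by the queue: wait on readers and writers */
		__u64	bo_write_handles_array;
		__u32	num_bo_write_handles;
		__u32	pad3;
		/* out: drm_amdgpu_userq_fence_info array and its size */
		__u64	userq_fence_info;
		__u64	num_fences;
	};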
[PATCH v2] docs/gpu: ci: update flake tests requirements
Update the documentation to require linking to a relevant GitLab issue for each new flake entry instead of an email report. Added specific GitLab issue URLs for i915, xe and other drivers. Signed-off-by: Vignesh Raman --- v2: - Add gitlab issue link for msm driver. --- Documentation/gpu/automated_testing.rst | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst index 2d5a28866afe..f918fe56f2b0 100644 --- a/Documentation/gpu/automated_testing.rst +++ b/Documentation/gpu/automated_testing.rst @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific hardware revision are known to behave unreliably. These tests won't cause a job to fail regardless of the result. They will still be run. -Each new flake entry must be associated with a link to the email reporting the -bug to the author of the affected driver, the board name or Device Tree name of -the board, the first kernel version affected, the IGT version used for tests, -and an approximation of the failure rate. +Each new flake entry must include a link to the relevant GitLab issue, the board +name or Device Tree name, the first kernel version affected, the IGT version used +for tests and an approximation of the failure rate. They should be provided under the following format:: - # Bug Report: $LORE_OR_PATCHWORK_URL + # Bug Report: $GITLAB_ISSUE # Board Name: broken-board.dtb # Linux Version: 6.6-rc1 # IGT Version: 1.28-gd2af13d9f # Failure Rate: 100 flaky-test +The GitLab issue must include the logs and the pipeline link. Use the appropriate +link below to create an issue. +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers + drivers/gpu/drm/ci/${DRIVER_NAME}-${HW_REVISION}-skips.txt --- -- 2.43.0
Re: [PATCH v2] docs/gpu: ci: update flake tests requirements
On Thu, Sep 26, 2024 at 12:36:49PM GMT, Vignesh Raman wrote: > Update the documentation to require linking to a relevant GitLab > issue for each new flake entry instead of an email report. Added > specific GitLab issue URLs for i915, xe and other drivers. > > Signed-off-by: Vignesh Raman > --- > > v2: > - Add gitlab issue link for msm driver. > > --- > Documentation/gpu/automated_testing.rst | 16 +++- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/Documentation/gpu/automated_testing.rst > b/Documentation/gpu/automated_testing.rst > index 2d5a28866afe..f918fe56f2b0 100644 > --- a/Documentation/gpu/automated_testing.rst > +++ b/Documentation/gpu/automated_testing.rst > @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific > hardware revision are > known to behave unreliably. These tests won't cause a job to fail regardless > of > the result. They will still be run. > > -Each new flake entry must be associated with a link to the email reporting > the > -bug to the author of the affected driver, the board name or Device Tree name > of > -the board, the first kernel version affected, the IGT version used for tests, > -and an approximation of the failure rate. > +Each new flake entry must include a link to the relevant GitLab issue, the > board > +name or Device Tree name, the first kernel version affected, the IGT version > used > +for tests and an approximation of the failure rate. > > They should be provided under the following format:: > > - # Bug Report: $LORE_OR_PATCHWORK_URL > + # Bug Report: $GITLAB_ISSUE ># Board Name: broken-board.dtb ># Linux Version: 6.6-rc1 ># IGT Version: 1.28-gd2af13d9f ># Failure Rate: 100 >flaky-test > > +The GitLab issue must include the logs and the pipeline link. Use the > appropriate > +link below to create an issue. > +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver > +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver > +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver > +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers > + I can't comment for the others, but drm-misc at least still requires reporting issues by mail, so, no, sorry, we can't switch to gitlab only for now. Maxime signature.asc Description: PGP signature
Re: [PATCH v2] docs/gpu: ci: update flake tests requirements
On Thu, Sep 26, 2024 at 12:36:49PM GMT, Vignesh Raman wrote: > Update the documentation to require linking to a relevant GitLab > issue for each new flake entry instead of an email report. Added > specific GitLab issue URLs for i915, xe and other drivers. > > Signed-off-by: Vignesh Raman > --- > > v2: > - Add gitlab issue link for msm driver. > > --- > Documentation/gpu/automated_testing.rst | 16 +++- > 1 file changed, 11 insertions(+), 5 deletions(-) > > diff --git a/Documentation/gpu/automated_testing.rst > b/Documentation/gpu/automated_testing.rst > index 2d5a28866afe..f918fe56f2b0 100644 > --- a/Documentation/gpu/automated_testing.rst > +++ b/Documentation/gpu/automated_testing.rst > @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific > hardware revision are > known to behave unreliably. These tests won't cause a job to fail regardless > of > the result. They will still be run. > > -Each new flake entry must be associated with a link to the email reporting > the > -bug to the author of the affected driver, the board name or Device Tree name > of > -the board, the first kernel version affected, the IGT version used for tests, > -and an approximation of the failure rate. > +Each new flake entry must include a link to the relevant GitLab issue, the > board > +name or Device Tree name, the first kernel version affected, the IGT version > used > +for tests and an approximation of the failure rate. > > They should be provided under the following format:: > > - # Bug Report: $LORE_OR_PATCHWORK_URL > + # Bug Report: $GITLAB_ISSUE ># Board Name: broken-board.dtb ># Linux Version: 6.6-rc1 ># IGT Version: 1.28-gd2af13d9f ># Failure Rate: 100 >flaky-test > > +The GitLab issue must include the logs and the pipeline link. Use the > appropriate > +link below to create an issue. > +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver > +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver > +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver Ack for the MSM part. > +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers -- With best wishes Dmitry
[PATCH v2 1/4] drm/amdgpu/sdma5: split out per instance resume function
From: Jiadong Zhu Extract the resume sequence from sdma_v5_0_gfx_resume for starting/restarting an individual instance. Signed-off-by: Jiadong Zhu Acked-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 253 ++--- 1 file changed, 138 insertions(+), 115 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index 3e48ea38385d..e813da1e48aa 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -705,14 +705,16 @@ static void sdma_v5_0_enable(struct amdgpu_device *adev, bool enable) } /** - * sdma_v5_0_gfx_resume - setup and start the async dma engines + * sdma_v5_0_gfx_resume_instance - start/restart a certain sdma engine * * @adev: amdgpu_device pointer + * @i: instance + * @restore: used to restore wptr when restart * - * Set up the gfx DMA ring buffers and enable them (NAVI10). - * Returns 0 for success, error for failure. + * Set up the gfx DMA ring buffers and enable them. On restart, we will restore wptr and rptr. + * Return 0 for success. */ -static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) +static int sdma_v5_0_gfx_resume_instance(struct amdgpu_device *adev, int i, bool restore) { struct amdgpu_ring *ring; u32 rb_cntl, ib_cntl; @@ -722,142 +724,163 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev) u32 temp; u32 wptr_poll_cntl; u64 wptr_gpu_addr; - int i, r; - for (i = 0; i < adev->sdma.num_instances; i++) { - ring = &adev->sdma.instance[i].ring; + ring = &adev->sdma.instance[i].ring; - if (!amdgpu_sriov_vf(adev)) - WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); + if (!amdgpu_sriov_vf(adev)) + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); - /* Set ring buffer size in dwords */ - rb_bufsz = order_base_2(ring->ring_size / 4); - rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL)); - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, rb_bufsz); + /* Set ring buffer size in dwords */ + rb_bufsz = order_base_2(ring->ring_size / 4); + rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL)); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, rb_bufsz); #ifdef __BIG_ENDIAN - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SWAP_ENABLE, 1); - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, - RPTR_WRITEBACK_SWAP_ENABLE, 1); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SWAP_ENABLE, 1); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, + RPTR_WRITEBACK_SWAP_ENABLE, 1); #endif - WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl); - - /* Initialize the ring buffer's read and write pointers */ + WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl); + + /* Initialize the ring buffer's read and write pointers */ + if (restore) { + WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR), lower_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_HI), upper_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); + } else { WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR), 0); WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR_HI), 0); WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), 0); WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), 0); - - /* setup the wptr shadow polling */ - wptr_gpu_addr = ring->wptr_gpu_addr; - WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO), - lower_32_bits(wptr_gpu_addr)); - WREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI), - upper_32_bits(wptr_gpu_addr)); - wptr_poll_cntl = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, - mmSDMA0_GFX_RB_WPTR_POLL_CNTL)); - wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl, -
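The tail of the diff is cut off above; for context, the remainder of the patch turns the original sdma_v5_0_gfx_resume() into a thin loop over the new per-instance helper, roughly as follows (an approximation, not the literal hunk):

	static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
	{
		int i, r;

		for (i = 0; i < adev->sdma.num_instances; i++) {
			/* Fresh start: do not restore wptr/rptr. */
			r = sdma_v5_0_gfx_resume_instance(adev, i, false);
			if (r)
				return r;
		}

		return 0;
	}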
[PATCH v2 2/4] drm/amdgpu/sdma5: implement ring reset callback for sdma5
From: Jiadong Zhu Implement sdma queue reset callback via MMIO. v2: enter/exit safemode when sdma queue reset. Signed-off-by: Jiadong Zhu --- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 88 ++ 1 file changed, 88 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c index e813da1e48aa..416273d917e0 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c @@ -1555,6 +1555,93 @@ static int sdma_v5_0_soft_reset(void *handle) return 0; } +static int sdma_v5_0_reset_queue(struct amdgpu_ring *ring, unsigned int vmid) +{ + struct amdgpu_device *adev = ring->adev; + int i, j, r; + u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, stat1_reg; + + if (amdgpu_sriov_vf(adev)) + return -EINVAL; + + for (i = 0; i < adev->sdma.num_instances; i++) { + if (ring == &adev->sdma.instance[i].ring) + break; + } + + if (i == adev->sdma.num_instances) { + DRM_ERROR("sdma instance not found\n"); + return -EINVAL; + } + + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); + + /* stop queue */ + ib_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL)); + ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), ib_cntl); + + rb_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL)); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl); + + /* engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit to 1 */ + freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE)); + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); + + for (j = 0; j < adev->usec_timeout; j++) { + freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE)); + if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1) + break; + udelay(1); + } + + /* check sdma copy engine all idle if frozen not received*/ + if (j == adev->usec_timeout) { + stat1_reg = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_STATUS1_REG)); + if ((stat1_reg & 0x3FF) != 0x3FF) { + DRM_ERROR("cannot soft reset as sdma not idle\n"); + r = -ETIMEDOUT; + goto err0; + } + } + + f32_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_F32_CNTL)); + f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl); + + cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL)); + cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl); + + /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */ + preempt = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT)); + preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), preempt); + + soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET); + soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i; + + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); + + udelay(50); + + soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i); + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); + + /* unfreeze*/ + freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE)); + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0); + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), 
freeze); + + r = sdma_v5_0_gfx_resume_instance(adev, i, true); + +err0: + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); + return r; +} + static int sdma_v5_0_ring_preempt_ib(struct amdgpu_ring *ring) { int i, r = 0; @@ -1897,6 +1984,7 @@ static const struct amdgpu_ring_funcs sdma_v5_0_ring_funcs = { .emit_reg_write_reg_wait = sdma_v5_0_ring_emit_reg_write_reg_wait, .init_cond_exec = sdma_v5_0_ring_init_cond_exec, .preempt_ib = sdma_v5_0_ring_preempt_ib, + .reset = sdma_v5_0_reset_queue, }; static void sdma_v5_0_set_ring_funcs(struct amdgpu_device *adev) -- 2.25.1
[PATCH v2 3/4] drm/amdgpu/sdma5.2: split out per instance resume function
From: Jiadong Zhu Extract the resume sequence from sdma_v5_2_gfx_resume for starting/restarting an individual instance. Signed-off-by: Jiadong Zhu Acked-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 247 ++--- 1 file changed, 136 insertions(+), 111 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index bc9b240a3488..37534c6f431f 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -522,14 +522,17 @@ static void sdma_v5_2_enable(struct amdgpu_device *adev, bool enable) } /** - * sdma_v5_2_gfx_resume - setup and start the async dma engines + * sdma_v5_2_gfx_resume_instance - start/restart a certain sdma engine * * @adev: amdgpu_device pointer + * @i: instance + * @restore: used to restore wptr when restart * - * Set up the gfx DMA ring buffers and enable them. - * Returns 0 for success, error for failure. + * Set up the gfx DMA ring buffers and enable them. On restart, we will restore wptr and rptr. + * Return 0 for success. */ -static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) + +static int sdma_v5_2_gfx_resume_instance(struct amdgpu_device *adev, int i, bool restore) { struct amdgpu_ring *ring; u32 rb_cntl, ib_cntl; @@ -539,139 +542,161 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev) u32 temp; u32 wptr_poll_cntl; u64 wptr_gpu_addr; - int i, r; - for (i = 0; i < adev->sdma.num_instances; i++) { - ring = &adev->sdma.instance[i].ring; + ring = &adev->sdma.instance[i].ring; - if (!amdgpu_sriov_vf(adev)) - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); + if (!amdgpu_sriov_vf(adev)) + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_SEM_WAIT_FAIL_TIMER_CNTL), 0); - /* Set ring buffer size in dwords */ - rb_bufsz = order_base_2(ring->ring_size / 4); - rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL)); - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, rb_bufsz); + /* Set ring buffer size in dwords */ + rb_bufsz = order_base_2(ring->ring_size / 4); + rb_cntl = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL)); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SIZE, rb_bufsz); #ifdef __BIG_ENDIAN - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SWAP_ENABLE, 1); - rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, - RPTR_WRITEBACK_SWAP_ENABLE, 1); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_SWAP_ENABLE, 1); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, + RPTR_WRITEBACK_SWAP_ENABLE, 1); #endif - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl); - - /* Initialize the ring buffer's read and write pointers */ + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl); + + /* Initialize the ring buffer's read and write pointers */ + if (restore) { + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR), lower_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR_HI), upper_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2)); + WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2)); + } else { WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_RPTR), 0); WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, 
mmSDMA0_GFX_RB_RPTR_HI), 0); WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), 0); WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), 0); + } - /* setup the wptr shadow polling */ - wptr_gpu_addr = ring->wptr_gpu_addr; - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO), - lower_32_bits(wptr_gpu_addr)); - WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI), - upper_32_bits(wptr_gpu_addr)); - wptr_poll_cntl = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, - mmSDMA0_GFX_RB_WPTR_POLL_CNTL)); - wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl, -
[PATCH v2 4/4] drm/amdgpu/sdma5.2: implement ring reset callback for sdma5.2
From: Jiadong Zhu Implement sdma queue reset callback via MMIO. v2: enter/exit safemode for mmio queue reset. Signed-off-by: Jiadong Zhu --- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 91 ++ 1 file changed, 91 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c index 37534c6f431f..8e8f8be01539 100644 --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c @@ -1424,6 +1424,96 @@ static int sdma_v5_2_wait_for_idle(void *handle) return -ETIMEDOUT; } +static int sdma_v5_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid) +{ + struct amdgpu_device *adev = ring->adev; + int i, j, r; + u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, stat1_reg; + + if (amdgpu_sriov_vf(adev)) + return -EINVAL; + + for (i = 0; i < adev->sdma.num_instances; i++) { + if (ring == &adev->sdma.instance[i].ring) + break; + } + + if (i == adev->sdma.num_instances) { + DRM_ERROR("sdma instance not found\n"); + return -EINVAL; + } + + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); + + /* stop queue */ + ib_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL)); + ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), ib_cntl); + + rb_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL)); + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), rb_cntl); + + /*engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit to 1 */ + freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE)); + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); + + for (j = 0; j < adev->usec_timeout; j++) { + freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE)); + + if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1) + break; + udelay(1); + } + + + if (j == adev->usec_timeout) { + stat1_reg = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_STATUS1_REG)); + if ((stat1_reg & 0x3FF) != 0x3FF) { + DRM_ERROR("cannot soft reset as sdma not idle\n"); + r = -ETIMEDOUT; + goto err0; + } + } + + f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL)); + f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl); + + cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL)); + cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl); + + /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */ + preempt = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT)); + preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), preempt); + + soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET); + soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i; + + + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); + + udelay(50); + + soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i); + + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); + + /* unfreeze and unhalt */ + freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE)); + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0); + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); + + r = 
sdma_v5_2_gfx_resume_instance(adev, i, true); + +err0: + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); + return r; +} + static int sdma_v5_2_ring_preempt_ib(struct amdgpu_ring *ring) { int i, r = 0; @@ -1859,6 +1949,7 @@ static const struct amdgpu_ring_funcs sdma_v5_2_ring_funcs = { .emit_reg_write_reg_wait = sdma_v5_2_ring_emit_reg_write_reg_wait, .init_cond_exec = sdma_v5_2_ring_init_cond_exec, .preempt_ib = sdma_v5_2_ring_preempt_ib, + .reset = sdma_v5_2_reset_queue, }; static void sdma_v5_2_set_ring_funcs(struct amdgpu_device *adev) -- 2.25.1
Re: [PATCH v5 41/80] drm/tidss: Run DRM default client setup
On 24/09/2024 10:12, Thomas Zimmermann wrote: Call drm_client_setup() to run the kernel's default client setup for DRM. Set fbdev_probe in struct drm_driver, so that the client setup can start the common fbdev client. The tidss driver specifies a preferred color mode of 32. As this is the default if no format has been given, leave it out entirely. v5: - select DRM_CLIENT_SELECTION Signed-off-by: Thomas Zimmermann Cc: Jyri Sarha Cc: Tomi Valkeinen Acked-by: Javier Martinez Canillas --- drivers/gpu/drm/tidss/Kconfig | 1 + drivers/gpu/drm/tidss/tidss_drv.c | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) Tested on AM62 SK. Reviewed-by: Tomi Valkeinen Tomi diff --git a/drivers/gpu/drm/tidss/Kconfig b/drivers/gpu/drm/tidss/Kconfig index 2385c56493b9..31ad582b7602 100644 --- a/drivers/gpu/drm/tidss/Kconfig +++ b/drivers/gpu/drm/tidss/Kconfig @@ -2,6 +2,7 @@ config DRM_TIDSS tristate "DRM Support for TI Keystone" depends on DRM && OF depends on ARM || ARM64 || COMPILE_TEST + select DRM_CLIENT_SELECTION select DRM_KMS_HELPER select DRM_DISPLAY_HELPER select DRM_BRIDGE_CONNECTOR diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c index d15f836dca95..2428b9aaa003 100644 --- a/drivers/gpu/drm/tidss/tidss_drv.c +++ b/drivers/gpu/drm/tidss/tidss_drv.c @@ -11,6 +11,7 @@ #include #include +#include #include #include #include @@ -109,6 +110,7 @@ static const struct drm_driver tidss_driver = { .fops = &tidss_fops, .release= tidss_release, DRM_GEM_DMA_DRIVER_OPS_VMAP, + DRM_FBDEV_DMA_DRIVER_OPS, .name = "tidss", .desc = "TI Keystone DSS", .date = "20180215", @@ -186,7 +188,7 @@ static int tidss_probe(struct platform_device *pdev) goto err_irq_uninstall; } - drm_fbdev_dma_setup(ddev, 32); + drm_client_setup(ddev, NULL); dev_dbg(dev, "%s done\n", __func__);
Re: [PATCH v2 2/4] drm/amdgpu/sdma5: implement ring reset callback for sdma5
On Thu, Sep 26, 2024 at 3:58 AM wrote: > > From: Jiadong Zhu > > Implement sdma queue reset callback via MMIO. > > v2: enter/exit safemode when sdma queue reset. > > Signed-off-by: Jiadong Zhu Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 88 ++ > 1 file changed, 88 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > index e813da1e48aa..416273d917e0 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c > @@ -1555,6 +1555,93 @@ static int sdma_v5_0_soft_reset(void *handle) > return 0; > } > > +static int sdma_v5_0_reset_queue(struct amdgpu_ring *ring, unsigned int vmid) > +{ > + struct amdgpu_device *adev = ring->adev; > + int i, j, r; > + u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, > stat1_reg; > + > + if (amdgpu_sriov_vf(adev)) > + return -EINVAL; > + > + for (i = 0; i < adev->sdma.num_instances; i++) { > + if (ring == &adev->sdma.instance[i].ring) > + break; > + } > + > + if (i == adev->sdma.num_instances) { > + DRM_ERROR("sdma instance not found\n"); > + return -EINVAL; > + } > + > + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); > + > + /* stop queue */ > + ib_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_GFX_IB_CNTL)); > + ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), > ib_cntl); > + > + rb_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_GFX_RB_CNTL)); > + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), > rb_cntl); > + > + /* engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit > to 1 */ > + freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE)); > + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); > + > + for (j = 0; j < adev->usec_timeout; j++) { > + freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_FREEZE)); > + if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1) > + break; > + udelay(1); > + } > + > + /* check sdma copy engine all idle if frozen not received*/ > + if (j == adev->usec_timeout) { > + stat1_reg = RREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_STATUS1_REG)); > + if ((stat1_reg & 0x3FF) != 0x3FF) { > + DRM_ERROR("cannot soft reset as sdma not idle\n"); > + r = -ETIMEDOUT; > + goto err0; > + } > + } > + > + f32_cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_F32_CNTL)); > + f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl); > + > + cntl = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL)); > + cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl); > + > + /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 > mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */ > + preempt = RREG32(sdma_v5_0_get_reg_offset(adev, i, > mmSDMA0_GFX_PREEMPT)); > + preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), > preempt); > + > + soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET); > + soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i; > + > + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); > + > + udelay(50); > + > + soft_reset &= ~(1 << 
GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i); > + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); > + > + /* unfreeze*/ > + freeze = RREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE)); > + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0); > + WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); > + > + r = sdma_v5_0_gfx_resume_instance(adev, i, true); > + > +err0: > + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); > + return r; > +} > + > static int sdma_v5_0_ring_preempt_ib(struct amdgpu_ring *ring) > { > int i, r = 0; > @@ -1897,6 +1984,7 @@ static const struct amdgpu_ring_funcs > sdma_v5_0_ring_funcs = { > .emit_reg_write_reg_wait = sdma_v5_0_ring_emit_reg_write_reg_wait, > .init_cond_exec = sdma_v5_0_ring_init_cond_exec, > .preempt_ib = sdma_v5_0_ring_preempt_ib, > + .reset = sdma_v5_0_reset_queue, > }; >
RE: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed
[AMD Official Use Only - AMD Internal Distribution Only] -Original Message- From: Alex Deucher Sent: Thursday, September 26, 2024 1:17 AM To: Zhang, Yifan Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ; Kuehling, Felix Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed On Tue, Sep 24, 2024 at 10:22 AM Zhang, Yifan wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > 2GB limitation in VRAM allocation is removed in below patch. My patch is a > follow up refine for this. The remaing_size calculation was to address the > 2GB limitation in contiguous VRAM allocation, and no longer needed after the > limitation is removed. > Thanks. Would be good to add a reference to this commit in the commit message so it's clear what you are talking about and also provide a bit more of a description about why this can be cleaned up (like you did above). One other comment below as well. Sure, I will send patch V2 to change commit message. > commit b2dba064c9bdd18c7dd39066d25453af28451dbf > Author: Philip Yang > Date: Fri Apr 19 16:27:00 2024 -0400 > > drm/amdgpu: Handle sg size limit for contiguous allocation > > Define macro AMDGPU_MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist > length is unsigned int, and some users of it cast to a signed int, so > every segment of sg table is limited to size 2GB maximum. > > For contiguous VRAM allocation, don't limit the max buddy block size in > order to get contiguous VRAM memory. To workaround the sg table segment > size limit, allocate multiple segments if contiguous size is bigger than > AMDGPU_MAX_SG_SEGMENT_SIZE. > > Signed-off-by: Philip Yang > Reviewed-by: Christian König > Signed-off-by: Alex Deucher > > > > -Original Message- > From: Alex Deucher > Sent: Tuesday, September 24, 2024 10:07 PM > To: Zhang, Yifan > Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ; > Kuehling, Felix > Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB > limitation removed > > On Mon, Sep 23, 2024 at 4:28 AM Yifan Zhang wrote: > > > > Make vram alloc loop simpler after 2GB limitation removed. > > Can you provide more context? What 2GB limitation are you referring to? 
> > Alex > > > > > Signed-off-by: Yifan Zhang > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 15 +-- > > 1 file changed, 5 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > index 7d26a962f811..3d129fd61fa7 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > @@ -455,7 +455,7 @@ static int amdgpu_vram_mgr_new(struct > > ttm_resource_manager *man, > > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); > > u64 vis_usage = 0, max_bytes, min_block_size; > > struct amdgpu_vram_mgr_resource *vres; > > - u64 size, remaining_size, lpfn, fpfn; > > + u64 size, lpfn, fpfn; > > unsigned int adjust_dcc_size = 0; > > struct drm_buddy *mm = &mgr->mm; > > struct drm_buddy_block *block; @@ -516,25 +516,23 @@ static > > int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, > > adev->gmc.gmc_funcs->get_dcc_alignment) > > adjust_dcc_size = > > amdgpu_gmc_get_dcc_alignment(adev); > > > > - remaining_size = (u64)vres->base.size; > > + size = (u64)vres->base.size; > > if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && > > adjust_dcc_size) { > > unsigned int dcc_size; > > > > dcc_size = roundup_pow_of_two(vres->base.size + > > adjust_dcc_size); > > - remaining_size = (u64)dcc_size; > > + size = (u64)dcc_size; > > > > vres->flags |= DRM_BUDDY_TRIM_DISABLE; > > } > > > > mutex_lock(&mgr->lock); > > - while (remaining_size) { > > + while (true) { I think you can also drop this while loop now too. Alex There is still a "continue" path left in the loop, I think the logic may be broken here if loop dropped. r = drm_buddy_alloc_blocks(mm, fpfn, lpfn, size, min_block_size, &vres->blocks, vres->flags); if (unlikely(r == -ENOSPC) && pages_per_block == ~0ul && !(place->flags & TTM_PL_FLAG_CONTIGUOUS)) { vres->flags &= ~DRM_BUDDY_CONTIGUOUS_ALLOCATION; pages_per_block = max_t(u32, 2UL << (20UL - PAGE_SHIFT), tbo->page_alignment); continue; /* continue in the loop */ } > > if (tbo->page_alignment) > > min_block_size = (u64)tbo->page_alignment << > > PAGE_SHIFT; > > else > > min_block_size = mgr->d
Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters
On 9/26/2024 5:56 PM, Pierre-Eric Pelloux-Prayer wrote: > > > Le 26/09/2024 à 11:27, Lazar, Lijo a écrit : >> >> >> On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: >>> Don't wake up the GPU for reading pm values. Instead, take a runtime >>> powermanagement ref when trying to read it iff the GPU is already >>> awake. >>> >>> This avoids spurious wake ups (eg: from applets). >>> >>> We use pm_runtime_get_if_in_active(ign_usage_count=true) because we care >>> about "is the GPU awake?" not about "is the GPU awake and something else >>> prevents suspend?". >> >> One possible downside of this approach is - let's say an application >> tries this way on a BACO enabled device (device is visible on bus, but >> powered off due to runtime PM) >> >> Get current clock level >> If (success) && (not desired clock level) >> Set clock level >> Submit some jobs to run at set clock level >> >> This type of approach won't work since get clock level() itself will >> fail. That said, I don't know if there is any app out there does >> something like this. > > Yes, this would break this pattern but I don't see why it would be used. > > If this theoretical concern becomes real because an application reports > a breakage after this series is merged, we can figure out how to best > deal with it. > Right, will see if Alex/Kenneth has any comments. That apart, the series looks good to me. Thanks, Lijo > Pierr-Eric > > >> >> Thanks, >> Lijo >> >>> >>> Tested-by: Mario Limonciello >>> Signed-off-by: Pierre-Eric Pelloux-Prayer >>> >>> --- >>> drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++--- >>> 1 file changed, 69 insertions(+), 69 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> index c8f34b1a4d8e..f1f339b75380 100644 >>> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c >>> @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> amdgpu_dpm_get_current_power_state(adev, &pm); >>> @@ -268,9 +268,9 @@ static ssize_t >>> amdgpu_get_power_dpm_force_performance_level(struct device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> level = amdgpu_dpm_get_performance_level(adev); >>> @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> if (amdgpu_dpm_get_pp_num_states(adev, &data)) >>> memset(&data, 0, sizeof(data)); >>> @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> amdgpu_dpm_get_current_power_state(adev, &pm); >>> @@ -526,9 +526,9 @@ static ssize_t 
amdgpu_get_pp_table(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> size = amdgpu_dpm_get_pp_table(adev, &table); >>> @@ -840,9 +840,9 @@ static ssize_t >>> amdgpu_get_pp_od_clk_voltage(struct device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(ddev->dev); >>> - if (ret < 0) >>> - return ret; >>> + ret = pm_runtime_get_if_active(ddev->dev, true); >>> + if (ret <= 0) >>> + return ret ?: -EPERM; >>> for (clk_index = 0 ; clk_index < 6 ; clk_index++) { >>> ret = amdgpu_dpm_emit_clock_levels(adev, >>> od_clocks[clk_index], buf, &size); >>> @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct >>> device *dev, >>> if (adev->in_suspend && !adev->in_runpm) >>> return -EPERM; >>> - ret = pm_runtime_resume_and_get(dde
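For context, a minimal sketch of the getter pattern the series converges on is shown below. This is illustrative kernel-style code, not the driver's actual source: example_show() and example_read_hw() are invented names, and the two-argument pm_runtime_get_if_active() form simply follows the diff above (the helper's signature has varied across kernel versions).

    #include <linux/device.h>
    #include <linux/pm_runtime.h>

    /*
     * Sketch of the "don't wake the GPU for a read" getter pattern.
     * A runtime PM reference is taken only if the device is already active,
     * so a suspended GPU makes the read fail instead of being resumed.
     */
    static ssize_t example_show(struct device *dev,
                                struct device_attribute *attr, char *buf)
    {
            int value, ret;

            ret = pm_runtime_get_if_active(dev, true);
            if (ret <= 0)
                    return ret ?: -EPERM;   /* 0: device asleep, <0: error */

            value = example_read_hw(dev);   /* hypothetical hardware read */

            pm_runtime_put_autosuspend(dev);

            return sysfs_emit(buf, "%d\n", value);
    }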
Re: [PATCH] drm/amdgpu: fix vbios fetching for SR-IOV
On Thu, Sep 26, 2024 at 7:18 AM Lazar, Lijo wrote: > > > > On 9/25/2024 11:50 PM, Alex Deucher wrote: > > SR-IOV fetches the vbios from VRAM in some cases. > > Re-enable the VRAM path for dGPUs and rename the function > > to make it clear that it is not IGP specific. > > > > Fixes: 042658d17a54 ("drm/amdgpu: clean up vbios fetching code") > > Signed-off-by: Alex Deucher > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 11 +-- > > 1 file changed, 9 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > > index 46bf623919d7c..45affc02548c1 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c > > @@ -87,8 +87,9 @@ static bool check_atom_bios(uint8_t *bios, size_t size) > > * part of the system bios. On boot, the system bios puts a > > * copy of the igp rom at the start of vram if a discrete card is > > * present. > > + * For SR-IOV, the vbios image is also put in VRAM in the VF. > > */ > > -static bool igp_read_bios_from_vram(struct amdgpu_device *adev) > > +static bool amdgpu_read_bios_from_vram(struct amdgpu_device *adev) > > { > > uint8_t __iomem *bios; > > resource_size_t vram_base; > > @@ -414,7 +415,7 @@ static bool amdgpu_get_bios_apu(struct amdgpu_device > > *adev) > > goto success; > > } > > > > - if (igp_read_bios_from_vram(adev)) { > > + if (amdgpu_read_bios_from_vram(adev)) { > > dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n"); > > goto success; > > } > > @@ -448,6 +449,12 @@ static bool amdgpu_get_bios_dgpu(struct amdgpu_device > > *adev) > > goto success; > > } > > > > + /* this is required for SR-IOV */ > > + if (amdgpu_read_bios_from_vram(adev)) { > > If this is only expected for VFs, then it's better to add > amdgpu_sriov_vf(adev) check for this call. I think that is the case, but I'm not 100% sure. I can ask the vbios team. Alex > > Thanks, > Lijo > > > + dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n"); > > + goto success; > > + } > > + > > if (amdgpu_read_platform_bios(adev)) { > > dev_info(adev->dev, "Fetched VBIOS from platform\n"); > > goto success;
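If the VF-only gating Lijo asks for turns out to be the right call, the hunk would presumably shrink to something like the sketch below. This is only an illustration of the suggestion; as noted above, Alex still needs confirmation from the vbios team on whether bare-metal dGPUs ever rely on the VRAM path.

    /* Sketch of the suggested gating: restrict the VRAM fetch to SR-IOV VFs. */
    if (amdgpu_sriov_vf(adev) && amdgpu_read_bios_from_vram(adev)) {
            dev_info(adev->dev, "Fetched VBIOS from VRAM BAR\n");
            goto success;
    }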
Re: [PATCH v2 4/4] drm/amdgpu/sdma5.2: implement ring reset callback for sdma5.2
On Thu, Sep 26, 2024 at 3:48 AM wrote: > > From: Jiadong Zhu > > Implement sdma queue reset callback via MMIO. > > v2: enter/exit safemode for mmio queue reset. > > Signed-off-by: Jiadong Zhu Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 91 ++ > 1 file changed, 91 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > index 37534c6f431f..8e8f8be01539 100644 > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c > @@ -1424,6 +1424,96 @@ static int sdma_v5_2_wait_for_idle(void *handle) > return -ETIMEDOUT; > } > > +static int sdma_v5_2_reset_queue(struct amdgpu_ring *ring, unsigned int vmid) > +{ > + struct amdgpu_device *adev = ring->adev; > + int i, j, r; > + u32 rb_cntl, ib_cntl, f32_cntl, freeze, cntl, preempt, soft_reset, > stat1_reg; > + > + if (amdgpu_sriov_vf(adev)) > + return -EINVAL; > + > + for (i = 0; i < adev->sdma.num_instances; i++) { > + if (ring == &adev->sdma.instance[i].ring) > + break; > + } > + > + if (i == adev->sdma.num_instances) { > + DRM_ERROR("sdma instance not found\n"); > + return -EINVAL; > + } > + > + amdgpu_gfx_rlc_enter_safe_mode(adev, 0); > + > + /* stop queue */ > + ib_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_GFX_IB_CNTL)); > + ib_cntl = REG_SET_FIELD(ib_cntl, SDMA0_GFX_IB_CNTL, IB_ENABLE, 0); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_IB_CNTL), > ib_cntl); > + > + rb_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_GFX_RB_CNTL)); > + rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 0); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_CNTL), > rb_cntl); > + > + /*engine stop SDMA1_F32_CNTL.HALT to 1 and SDMAx_FREEZE freeze bit to > 1 */ > + freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE)); > + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 1); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); > + > + for (j = 0; j < adev->usec_timeout; j++) { > + freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_FREEZE)); > + > + if (REG_GET_FIELD(freeze, SDMA0_FREEZE, FROZEN) & 1) > + break; > + udelay(1); > + } > + > + > + if (j == adev->usec_timeout) { > + stat1_reg = RREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_STATUS1_REG)); > + if ((stat1_reg & 0x3FF) != 0x3FF) { > + DRM_ERROR("cannot soft reset as sdma not idle\n"); > + r = -ETIMEDOUT; > + goto err0; > + } > + } > + > + f32_cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_F32_CNTL)); > + f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_F32_CNTL, HALT, 1); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_F32_CNTL), f32_cntl); > + > + cntl = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL)); > + cntl = REG_SET_FIELD(cntl, SDMA0_CNTL, UTC_L1_ENABLE, 0); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_CNTL), cntl); > + > + /* soft reset SDMA_GFX_PREEMPT.IB_PREEMPT = 0 > mmGRBM_SOFT_RESET.SOFT_RESET_SDMA0/1 = 1 */ > + preempt = RREG32(sdma_v5_2_get_reg_offset(adev, i, > mmSDMA0_GFX_PREEMPT)); > + preempt = REG_SET_FIELD(preempt, SDMA0_GFX_PREEMPT, IB_PREEMPT, 0); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_PREEMPT), > preempt); > + > + soft_reset = RREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET); > + soft_reset |= 1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i; > + > + > + WREG32_SOC15(GC, 0, mmGRBM_SOFT_RESET, soft_reset); > + > + udelay(50); > + > + soft_reset &= ~(1 << GRBM_SOFT_RESET__SOFT_RESET_SDMA0__SHIFT << i); > + > + WREG32_SOC15(GC, 0, 
mmGRBM_SOFT_RESET, soft_reset); > + > + /* unfreeze and unhalt */ > + freeze = RREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE)); > + freeze = REG_SET_FIELD(freeze, SDMA0_FREEZE, FREEZE, 0); > + WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_FREEZE), freeze); > + > + r = sdma_v5_2_gfx_resume_instance(adev, i, true); > + > +err0: > + amdgpu_gfx_rlc_exit_safe_mode(adev, 0); > + return r; > +} > + > static int sdma_v5_2_ring_preempt_ib(struct amdgpu_ring *ring) > { > int i, r = 0; > @@ -1859,6 +1949,7 @@ static const struct amdgpu_ring_funcs > sdma_v5_2_ring_funcs = { > .emit_reg_write_reg_wait = sdma_v5_2_ring_emit_reg_write_reg_wait, > .init_cond_exec = sdma_v5_2_ring_init_cond_exec, > .preempt_ib = sdma_v5_2_ring_preempt_ib, > + .reset = sdma_v5_2_reset_queue, > }; > > static void sdma_v5_2_set_ri
RE: [PATCH v3] drm/amd/display: Fix out-of-bounds access in 'dcn21_link_encoder_create'
[Public] Reviewed-by: Roman Li > -Original Message- > From: SHANMUGAM, SRINIVASAN > Sent: Wednesday, September 25, 2024 11:10 AM > To: Siqueira, Rodrigo ; Pillai, Aurabindo > > Cc: amd-gfx@lists.freedesktop.org; SHANMUGAM, SRINIVASAN > ; Chung, ChiaHsuan (Tom) > ; Li, Roman ; Hung, Alex > ; Wentland, Harry ; Mahfooz, > Hamza > Subject: [PATCH v3] drm/amd/display: Fix out-of-bounds access in > 'dcn21_link_encoder_create' > > An issue was identified in the dcn21_link_encoder_create function where an > out-of- > bounds access could occur when the hpd_source index was used to reference the > link_enc_hpd_regs array. This array has a fixed size and the index was not > being > checked against the array's bounds before accessing it. > > This fix adds a conditional check to ensure that the hpd_source index is > within the > valid range of the link_enc_hpd_regs array. If the index is out of bounds, > the function > now returns NULL to prevent undefined behavior. > > References: > > [ 65.920507] [ cut here ] > [ 65.920510] UBSAN: array-index-out-of-bounds in > drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn21/dcn21_resource.c:1312: > 29 > [ 65.920519] index 7 is out of range for type 'dcn10_link_enc_hpd_registers > [5]' > [ 65.920523] CPU: 3 PID: 1178 Comm: modprobe Tainted: G OE > 6.8.0- > cleanershaderfeatureresetasdntipmi200nv2132 #13 > [ 65.920525] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS > WMJ0429N_Weekly_20_04_2 04/29/2020 > [ 65.920527] Call Trace: > [ 65.920529] > [ 65.920532] dump_stack_lvl+0x48/0x70 > [ 65.920541] dump_stack+0x10/0x20 > [ 65.920543] __ubsan_handle_out_of_bounds+0xa2/0xe0 > [ 65.920549] dcn21_link_encoder_create+0xd9/0x140 [amdgpu] > [ 65.921009] link_create+0x6d3/0xed0 [amdgpu] > [ 65.921355] create_links+0x18a/0x4e0 [amdgpu] > [ 65.921679] dc_create+0x360/0x720 [amdgpu] > [ 65.921999] ? dmi_matches+0xa0/0x220 > [ 65.922004] amdgpu_dm_init+0x2b6/0x2c90 [amdgpu] > [ 65.922342] ? console_unlock+0x77/0x120 > [ 65.922348] ? dev_printk_emit+0x86/0xb0 > [ 65.922354] dm_hw_init+0x15/0x40 [amdgpu] > [ 65.922686] amdgpu_device_init+0x26a8/0x33a0 [amdgpu] > [ 65.922921] amdgpu_driver_load_kms+0x1b/0xa0 [amdgpu] > [ 65.923087] amdgpu_pci_probe+0x1b7/0x630 [amdgpu] > [ 65.923087] local_pci_probe+0x4b/0xb0 > [ 65.923087] pci_device_probe+0xc8/0x280 > [ 65.923087] really_probe+0x187/0x300 > [ 65.923087] __driver_probe_device+0x85/0x130 > [ 65.923087] driver_probe_device+0x24/0x110 > [ 65.923087] __driver_attach+0xac/0x1d0 > [ 65.923087] ? __pfx___driver_attach+0x10/0x10 > [ 65.923087] bus_for_each_dev+0x7d/0xd0 > [ 65.923087] driver_attach+0x1e/0x30 > [ 65.923087] bus_add_driver+0xf2/0x200 > [ 65.923087] driver_register+0x64/0x130 > [ 65.923087] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu] > [ 65.923087] __pci_register_driver+0x61/0x70 > [ 65.923087] amdgpu_init+0x7d/0xff0 [amdgpu] > [ 65.923087] do_one_initcall+0x49/0x310 > [ 65.923087] ? kmalloc_trace+0x136/0x360 > [ 65.923087] do_init_module+0x6a/0x270 > [ 65.923087] load_module+0x1fce/0x23a0 > [ 65.923087] init_module_from_file+0x9c/0xe0 > [ 65.923087] ? 
init_module_from_file+0x9c/0xe0 > [ 65.923087] idempotent_init_module+0x179/0x230 > [ 65.923087] __x64_sys_finit_module+0x5d/0xa0 > [ 65.923087] do_syscall_64+0x76/0x120 > [ 65.923087] entry_SYSCALL_64_after_hwframe+0x6e/0x76 > [ 65.923087] RIP: 0033:0x7f2d80f1e88d > [ 65.923087] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 > f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 > ff ff > 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48 > [ 65.923087] RSP: 002b:7ffc7bc1aa78 EFLAGS: 0246 ORIG_RAX: > 0139 > [ 65.923087] RAX: ffda RBX: 564c9c1db130 RCX: > 7f2d80f1e88d > [ 65.923087] RDX: RSI: 564c9c1e5480 RDI: > 000f > [ 65.923087] RBP: 0004 R08: R09: > 0002 > [ 65.923087] R10: 000f R11: 0246 R12: > 564c9c1e5480 > [ 65.923087] R13: 564c9c1db260 R14: R15: > 564c9c1e54b0 > [ 65.923087] > [ 65.923927] ---[ end trace ]--- > > Cc: Tom Chung > Cc: Rodrigo Siqueira > Cc: Roman Li > Cc: Alex Hung > Cc: Aurabindo Pillai > Cc: Harry Wentland > Cc: Hamza Mahfooz > Signed-off-by: Srinivasan Shanmugam > --- > v2: > - Changed to ARRAY_SIZE() to be generic (Roman) > > v3: > - Updated git commit message > > drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c > b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c > index 347e6aaea582..14b28841657d 100644 > --- a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resourc
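The hunk itself is truncated above; going by the commit message and the v2 note about ARRAY_SIZE(), the guard added in dcn21_link_encoder_create() is presumably along the lines of the sketch below (a reconstruction, not the verbatim patch, and the parameter name enc_init_data is assumed from the surrounding code).

    /* Reconstructed bounds check: bail out instead of indexing past the array. */
    if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs))
            return NULL;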
Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM
Reviewed-by: Marek Olšák Marek On Thu, Sep 26, 2024 at 10:02 AM Alex Deucher wrote: > > On Thu, Sep 26, 2024 at 10:00 AM Marek Olšák wrote: > > > > Is GTT cleared too? > > GTT has always been cleared since it's system memory. > > Alex > > > > > > Marek > > > > On Thu, Sep 26, 2024, 09:53 Alex Deucher wrote: > >> > >> Ping? > >> > >> On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher > >> wrote: > >> > > >> > Driver now clears VRAM on allocation. Bump the > >> > driver version so mesa knows when it will get > >> > cleared vram by default. > >> > > >> > Signed-off-by: Alex Deucher > >> > --- > >> > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- > >> > 1 file changed, 2 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > >> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > >> > index d90473aec942..fad556be840b 100644 > >> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > >> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > >> > @@ -117,9 +117,10 @@ > >> > * - 3.56.0 - Update IB start address and size alignment for decode and > >> > encode > >> > * - 3.57.0 - Compute tunneling on GFX10+ > >> > * - 3.58.0 - Add GFX12 DCC support > >> > + * - 3.59.0 - Cleared VRAM > >> > */ > >> > #define KMS_DRIVER_MAJOR 3 > >> > -#define KMS_DRIVER_MINOR 58 > >> > +#define KMS_DRIVER_MINOR 59 > >> > #define KMS_DRIVER_PATCHLEVEL 0 > >> > > >> > /* > >> > -- > >> > 2.46.0 > >> >
Re: [PATCH v2 03/08] drm/amdgpu: Add wait IOCTL timeline syncobj support
Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: Add user fence wait IOCTL timeline syncobj support. v2:(Christian) - handle dma_fence_wait() return value. - shorten the variable name syncobj_timeline_points a bit. - move num_points up to avoid padding issues. Signed-off-by: Arunpravin Paneer Selvam --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 82 +-- include/uapi/drm/amdgpu_drm.h | 16 +++- 2 files changed, 88 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 97b1af574407..2465ca307644 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -474,11 +474,11 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) { + u32 *syncobj_handles, *timeline_points, *timeline_handles, *bo_handles; + u32 num_syncobj, num_bo_handles, num_points; struct drm_amdgpu_userq_fence_info *fence_info = NULL; struct drm_amdgpu_userq_wait *wait_info = data; - u32 *syncobj_handles, *bo_handles; struct dma_fence **fences = NULL; - u32 num_syncobj, num_bo_handles; struct drm_gem_object **gobj; struct drm_exec exec; int r, i, entry, cnt; @@ -498,11 +498,26 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, goto free_bo_handles; } + num_points = wait_info->num_points; + timeline_handles = memdup_user(u64_to_user_ptr(wait_info->syncobj_timeline_handles), + sizeof(u32) * num_points); + if (IS_ERR(timeline_handles)) { + r = PTR_ERR(timeline_handles); + goto free_syncobj_handles; + } + + timeline_points = memdup_user(u64_to_user_ptr(wait_info->syncobj_timeline_points), + sizeof(u32) * num_points); + if (IS_ERR(timeline_points)) { + r = PTR_ERR(timeline_points); + goto free_timeline_handles; + } + /* Array of GEM object handles */ gobj = kmalloc_array(num_bo_handles, sizeof(*gobj), GFP_KERNEL); if (!gobj) { r = -ENOMEM; - goto free_syncobj_handles; + goto free_timeline_points; } for (entry = 0; entry < num_bo_handles; entry++) { @@ -524,17 +539,34 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, } if (!wait_info->num_fences) { + if (num_points) { + struct dma_fence *fence; ++ + for (i = 0; i < num_points; i++) { + r = drm_syncobj_find_fence(filp, timeline_handles[i], + timeline_points[i], + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, + &fence); + if (r || !fence) You can't simply ignore errors here. That needs some goto error handling and cleanup. + continue; + + dma_fence_put(fence); + num_fences++; There can be more than one fence in the timeline we need to wait for. You should probably use dma_fence_unwrap_for_each() here. + } + } + /* Count syncobj's fence */ for (i = 0; i < num_syncobj; i++) { struct dma_fence *fence; r = drm_syncobj_find_fence(filp, syncobj_handles[i], - 0, 0, &fence); - dma_fence_put(fence); - + 0, + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, + &fence); if (r || !fence) continue; + dma_fence_put(fence); num_fences++; } @@ -589,12 +621,34 @@ int amdgpu_userq_wait_ioctl(struct drm_device *dev, void *data, } } + if (num_points) { + struct dma_fence *fence; + + for (i = 0; i < num_points; i++) { + r = drm_syncobj_find_fence(filp, timeline_handles[i], + timeline_points[i], + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, + &fence); + if (r || !fence) Same here,
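Folding both review comments into one place, the timeline-point counting loop might end up looking roughly like the sketch below: errors from drm_syncobj_find_fence() are propagated instead of skipped, and dma_fence_unwrap_for_each() walks chained/array fences so each underlying fence is counted. The goto target and surrounding cleanup are assumptions taken from the quoted code, not the final patch.

    #include <linux/dma-fence-unwrap.h>

    /* Sketch: count every fence behind each timeline point, failing on error. */
    for (i = 0; i < num_points; i++) {
            struct dma_fence_unwrap iter;
            struct dma_fence *fence, *f;

            r = drm_syncobj_find_fence(filp, timeline_handles[i],
                                       timeline_points[i],
                                       DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
                                       &fence);
            if (r)
                    goto free_timeline_points;  /* propagate, don't continue */

            /* A timeline point may resolve to a chain/array of fences. */
            dma_fence_unwrap_for_each(f, &iter, fence)
                    num_fences++;

            dma_fence_put(fence);
    }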
Re: [PATCH v1 5/9] drm/amd/pm: use pm_runtime_get_if_active for debugfs getters
Le 26/09/2024 à 11:27, Lazar, Lijo a écrit : On 9/25/2024 1:24 PM, Pierre-Eric Pelloux-Prayer wrote: Don't wake up the GPU for reading pm values. Instead, take a runtime powermanagement ref when trying to read it iff the GPU is already awake. This avoids spurious wake ups (eg: from applets). We use pm_runtime_get_if_in_active(ign_usage_count=true) because we care about "is the GPU awake?" not about "is the GPU awake and something else prevents suspend?". One possible downside of this approach is - let's say an application tries this way on a BACO enabled device (device is visible on bus, but powered off due to runtime PM) Get current clock level If (success) && (not desired clock level) Set clock level Submit some jobs to run at set clock level This type of approach won't work since get clock level() itself will fail. That said, I don't know if there is any app out there does something like this. Yes, this would break this pattern but I don't see why it would be used. If this theoretical concern becomes real because an application reports a breakage after this series is merged, we can figure out how to best deal with it. Pierr-Eric Thanks, Lijo Tested-by: Mario Limonciello Signed-off-by: Pierre-Eric Pelloux-Prayer --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 138 ++--- 1 file changed, 69 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index c8f34b1a4d8e..f1f339b75380 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -145,9 +145,9 @@ static ssize_t amdgpu_get_power_dpm_state(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; amdgpu_dpm_get_current_power_state(adev, &pm); @@ -268,9 +268,9 @@ static ssize_t amdgpu_get_power_dpm_force_performance_level(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; level = amdgpu_dpm_get_performance_level(adev); @@ -364,9 +364,9 @@ static ssize_t amdgpu_get_pp_num_states(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; if (amdgpu_dpm_get_pp_num_states(adev, &data)) memset(&data, 0, sizeof(data)); @@ -399,9 +399,9 @@ static ssize_t amdgpu_get_pp_cur_state(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; amdgpu_dpm_get_current_power_state(adev, &pm); @@ -526,9 +526,9 @@ static ssize_t amdgpu_get_pp_table(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; size = amdgpu_dpm_get_pp_table(adev, &table); @@ -840,9 +840,9 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + 
ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; for (clk_index = 0 ; clk_index < 6 ; clk_index++) { ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], buf, &size); @@ -930,9 +930,9 @@ static ssize_t amdgpu_get_pp_features(struct device *dev, if (adev->in_suspend && !adev->in_runpm) return -EPERM; - ret = pm_runtime_resume_and_get(ddev->dev); - if (ret < 0) - return ret; + ret = pm_runtime_get_if_active(ddev->dev, true); + if (ret <= 0) + return ret ?: -EPERM; size = amdgpu_dpm_get_ppfeature_status(adev, buf); if (size <= 0) @@ -996,9 +996,9 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev, if (adev->in_suspend && !adev->in_runpm)
Re: [PATCH v2 06/08] drm/amdgpu: Add few optimizations to userq fence driver
Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: Add few optimizations to userq fence driver. "Few optimization and fixes for userq fence driver". v1:(Christian): - Remove unnecessary comments. - In drm_exec_init call give num_bo_handles as last parameter it would making allocation of the array more efficient - Handle return value of __xa_store() and improve the error handling of amdgpu_userq_fence_driver_alloc(). v2:(Christian): - Revert userq_xa xarray init to XA_FLAGS_LOCK_IRQ. - move the xa_unlock before the error check of the call xa_err(__xa_store()) and moved this change to a separate patch as this is adding a missing error handling. - Removed the unnecessary comments. Signed-off-by: Arunpravin Paneer Selvam Reviewed-by: Christian König --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 44 --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 6 +-- .../gpu/drm/amd/include/amdgpu_userqueue.h| 2 +- 4 files changed, 32 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 8d2a0331cd96..f3576c31428c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -76,7 +76,8 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev, fence_drv = kzalloc(sizeof(*fence_drv), GFP_KERNEL); if (!fence_drv) { DRM_ERROR("Failed to allocate memory for fence driver\n"); - return -ENOMEM; + r = -ENOMEM; + goto free_fence_drv; } /* Acquire seq64 memory */ @@ -84,7 +85,8 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev, &fence_drv->cpu_addr); if (r) { kfree(fence_drv); - return -ENOMEM; + r = -ENOMEM; + goto free_seq64; } memset(fence_drv->cpu_addr, 0, sizeof(u64)); @@ -94,7 +96,7 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev, spin_lock_init(&fence_drv->fence_list_lock); fence_drv->adev = adev; - fence_drv->uq_fence_drv_xa_ref = &userq->uq_fence_drv_xa; + fence_drv->fence_drv_xa_ptr = &userq->fence_drv_xa; fence_drv->context = dma_fence_context_alloc(1); get_task_comm(fence_drv->timeline_name, current); @@ -106,6 +108,13 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev, userq->fence_drv = fence_drv; return 0; + +free_seq64: + amdgpu_seq64_free(adev, fence_drv->gpu_addr); +free_fence_drv: + kfree(fence_drv); + + return r; } void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_drv) @@ -147,7 +156,7 @@ void amdgpu_userq_fence_driver_destroy(struct kref *ref) struct amdgpu_device *adev = fence_drv->adev; struct amdgpu_userq_fence *fence, *tmp; struct xarray *xa = &adev->userq_xa; - unsigned long index; + unsigned long index, flags; struct dma_fence *f; spin_lock(&fence_drv->fence_list_lock); @@ -164,11 +173,11 @@ void amdgpu_userq_fence_driver_destroy(struct kref *ref) } spin_unlock(&fence_drv->fence_list_lock); - xa_lock(xa); + xa_lock_irqsave(xa, flags); xa_for_each(xa, index, xa_fence_drv) if (xa_fence_drv == fence_drv) __xa_erase(xa, index); - xa_unlock(xa); + xa_unlock_irqrestore(xa, flags); /* Free seq64 memory */ amdgpu_seq64_free(adev, fence_drv->gpu_addr); @@ -212,12 +221,12 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq, amdgpu_userq_fence_driver_get(fence_drv); dma_fence_get(fence); - if (!xa_empty(&userq->uq_fence_drv_xa)) { + if (!xa_empty(&userq->fence_drv_xa)) { struct amdgpu_userq_fence_driver *stored_fence_drv; unsigned long index, count = 0; int i = 0; - 
xa_for_each(&userq->uq_fence_drv_xa, index, stored_fence_drv) + xa_for_each(&userq->fence_drv_xa, index, stored_fence_drv) count++; userq_fence->fence_drv_array = @@ -226,9 +235,9 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq, GFP_KERNEL); if (userq_fence->fence_drv_array) { - xa_for_each(&userq->uq_fence_drv_xa, index, stored_fence_drv) { + xa_for_each(&userq->fence_drv_xa, index, stored_fence_drv) { userq_fence->fence_drv_array[i] = stored_fence_drv; - xa_erase(&userq->uq_fence_drv_xa, index); + xa_erase(&userq->fence_drv_xa, index); i++;
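Since the changelog calls out improving the error handling of amdgpu_userq_fence_driver_alloc(), it may help to recall the conventional goto-unwind shape for an allocate-then-allocate sequence: each label undoes only what had already succeeded at the point of the jump. The sketch below is generic; all example_* names are invented placeholders, not the driver's code.

    /* Generic unwind ordering; example_* names are placeholders. */
    static int example_alloc(struct device *dev, struct example_obj **out)
    {
            struct example_obj *obj;
            int r;

            obj = kzalloc(sizeof(*obj), GFP_KERNEL);
            if (!obj)
                    return -ENOMEM;         /* nothing allocated yet, just return */

            r = example_resource_alloc(dev, &obj->res);
            if (r)
                    goto free_obj;          /* only the kzalloc() needs undoing */

            r = example_register(dev, obj);
            if (r)
                    goto free_res;          /* undo the resource, then the kzalloc() */

            *out = obj;
            return 0;

    free_res:
            example_resource_free(dev, obj->res);
    free_obj:
            kfree(obj);
            return r;
    }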
Re: [PATCH v2 07/08] drm/amdgpu: Add the missing error handling for xa_store() call
Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: Add the missing error handling for xa_store() call in the function amdgpu_userq_fence_driver_alloc(). Signed-off-by: Arunpravin Paneer Selvam Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index f3576c31428c..43429661f62d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -101,9 +101,11 @@ int amdgpu_userq_fence_driver_alloc(struct amdgpu_device *adev, get_task_comm(fence_drv->timeline_name, current); xa_lock_irqsave(&adev->userq_xa, flags); - __xa_store(&adev->userq_xa, userq->doorbell_index, - fence_drv, GFP_KERNEL); + r = xa_err(__xa_store(&adev->userq_xa, userq->doorbell_index, + fence_drv, GFP_KERNEL)); xa_unlock_irqrestore(&adev->userq_xa, flags); + if (r) + goto free_seq64; userq->fence_drv = fence_drv;
Re: [PATCH v2 08/08] drm/amdgpu: add vm root BO lock before accessing the vm
Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: Add a vm root BO lock before accessing the userqueue VM. Signed-off-by: Arunpravin Paneer Selvam --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 43429661f62d..52722b738316 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -320,7 +320,6 @@ static const struct dma_fence_ops amdgpu_userq_fence_ops = { /** * amdgpu_userq_fence_read_wptr - Read the userq wptr value * - * @filp: drm file private data structure * @queue: user mode queue structure pointer * @wptr: write pointer value * @@ -330,23 +329,27 @@ static const struct dma_fence_ops amdgpu_userq_fence_ops = { * * Returns wptr value on success, error on failure. */ -static int amdgpu_userq_fence_read_wptr(struct drm_file *filp, - struct amdgpu_usermode_queue *queue, +static int amdgpu_userq_fence_read_wptr(struct amdgpu_usermode_queue *queue, u64 *wptr) { - struct amdgpu_fpriv *fpriv = filp->driver_priv; struct amdgpu_bo_va_mapping *mapping; - struct amdgpu_vm *vm = &fpriv->vm; struct amdgpu_bo *bo; u64 addr, *ptr; int r; + r = amdgpu_bo_reserve(queue->vm->root.bo, false); + if (r) + return r; + addr = queue->userq_prop->wptr_gpu_addr; addr &= AMDGPU_GMC_HOLE_MASK; - mapping = amdgpu_vm_bo_lookup_mapping(vm, addr >> PAGE_SHIFT); - if (!mapping) + mapping = amdgpu_vm_bo_lookup_mapping(queue->vm, addr >> PAGE_SHIFT); + amdgpu_bo_unreserve(queue->vm->root.bo); You need to keep the VM locked until you are done with the mapping. Otherwise the mapping could be released at any time. Regards, Christian. + if (!mapping) { + DRM_ERROR("Failed to lookup amdgpu_bo_va_mapping\n"); return -EINVAL; + } bo = mapping->bo_va->base.bo; If you only need the BO then grab a temporary BO reference here, drop the VM lock and acquire the BO. When you are done with everything just drop the BO lock and then the temporary BO reference. Regards, Christian. r = amdgpu_bo_reserve(bo, true); @@ -448,7 +451,7 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data, goto exec_fini; } - r = amdgpu_userq_fence_read_wptr(filp, queue, &wptr); + r = amdgpu_userq_fence_read_wptr(queue, &wptr); if (r) goto exec_fini;
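Christian's second option, taking a temporary reference on the backing BO while the VM root BO is still reserved and only then dropping the VM reservation, would look roughly like the sketch below. This is an illustration of the proposal based on the quoted code, not tested driver code.

    /* Sketch: keep the mapping's BO alive by referencing it before dropping
     * the VM root BO reservation. */
    r = amdgpu_bo_reserve(queue->vm->root.bo, false);
    if (r)
            return r;

    mapping = amdgpu_vm_bo_lookup_mapping(queue->vm, addr >> PAGE_SHIFT);
    if (!mapping || !mapping->bo_va || !mapping->bo_va->base.bo) {
            amdgpu_bo_unreserve(queue->vm->root.bo);
            return -EINVAL;
    }

    bo = amdgpu_bo_ref(mapping->bo_va->base.bo);    /* temporary reference */
    amdgpu_bo_unreserve(queue->vm->root.bo);

    r = amdgpu_bo_reserve(bo, true);
    if (r) {
            amdgpu_bo_unref(&bo);
            return r;
    }

    /* ... read the wptr through the BO's kernel mapping here ... */

    amdgpu_bo_unreserve(bo);
    amdgpu_bo_unref(&bo);                           /* drop temporary reference */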
Re: [PATCH v2 02/08] drm/amdgpu: screen freeze and userq driver crash
Am 25.09.24 um 21:59 schrieb Arunpravin Paneer Selvam: Screen freeze and userq fence driver crash while playing Xonotic v2: (Christian) - There is change that fence might signal in between testing and grabbing the lock. Hence we can move the lock above the if..else check and use the dma_fence_is_signaled_locked(). Signed-off-by: Arunpravin Paneer Selvam Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c index 96d1caf4c815..97b1af574407 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c @@ -185,6 +185,7 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq, struct amdgpu_userq_fence_driver *fence_drv; struct amdgpu_userq_fence *userq_fence; struct dma_fence *fence; + unsigned long flags; fence_drv = userq->fence_drv; if (!fence_drv) @@ -232,14 +233,14 @@ int amdgpu_userq_fence_create(struct amdgpu_usermode_queue *userq, userq_fence->fence_drv_array_count = 0; } - spin_lock(&fence_drv->fence_list_lock); /* Check if hardware has already processed the job */ - if (!dma_fence_is_signaled(fence)) + spin_lock_irqsave(&fence_drv->fence_list_lock, flags); + if (!dma_fence_is_signaled_locked(fence)) list_add_tail(&userq_fence->link, &fence_drv->fences); else dma_fence_put(fence); - spin_unlock(&fence_drv->fence_list_lock); + spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags); *f = fence;
Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM
Ping? On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher wrote: > > Driver now clears VRAM on allocation. Bump the > driver version so mesa knows when it will get > cleared vram by default. > > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index d90473aec942..fad556be840b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -117,9 +117,10 @@ > * - 3.56.0 - Update IB start address and size alignment for decode and > encode > * - 3.57.0 - Compute tunneling on GFX10+ > * - 3.58.0 - Add GFX12 DCC support > + * - 3.59.0 - Cleared VRAM > */ > #define KMS_DRIVER_MAJOR 3 > -#define KMS_DRIVER_MINOR 58 > +#define KMS_DRIVER_MINOR 59 > #define KMS_DRIVER_PATCHLEVEL 0 > > /* > -- > 2.46.0 >
RE: [PATCH] drm/amdgpu: bump driver version for cleared VRAM
[Public] Reviewed-by: Rajneesh Bhardwaj -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Thursday, September 26, 2024 9:46 AM To: Deucher, Alexander ; Olsak, Marek Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM Ping? On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher wrote: > > Driver now clears VRAM on allocation. Bump the driver version so mesa > knows when it will get cleared vram by default. > > Signed-off-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index d90473aec942..fad556be840b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -117,9 +117,10 @@ > * - 3.56.0 - Update IB start address and size alignment for decode and > encode > * - 3.57.0 - Compute tunneling on GFX10+ > * - 3.58.0 - Add GFX12 DCC support > + * - 3.59.0 - Cleared VRAM > */ > #define KMS_DRIVER_MAJOR 3 > -#define KMS_DRIVER_MINOR 58 > +#define KMS_DRIVER_MINOR 59 > #define KMS_DRIVER_PATCHLEVEL 0 > > /* > -- > 2.46.0 >
Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM
Is GTT cleared too? Marek On Thu, Sep 26, 2024, 09:53 Alex Deucher wrote: > Ping? > > On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher > wrote: > > > > Driver now clears VRAM on allocation. Bump the > > driver version so mesa knows when it will get > > cleared vram by default. > > > > Signed-off-by: Alex Deucher > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > index d90473aec942..fad556be840b 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > @@ -117,9 +117,10 @@ > > * - 3.56.0 - Update IB start address and size alignment for decode and > encode > > * - 3.57.0 - Compute tunneling on GFX10+ > > * - 3.58.0 - Add GFX12 DCC support > > + * - 3.59.0 - Cleared VRAM > > */ > > #define KMS_DRIVER_MAJOR 3 > > -#define KMS_DRIVER_MINOR 58 > > +#define KMS_DRIVER_MINOR 59 > > #define KMS_DRIVER_PATCHLEVEL 0 > > > > /* > > -- > > 2.46.0 > > >
Re: [PATCH] drm/amdgpu: bump driver version for cleared VRAM
On Thu, Sep 26, 2024 at 10:00 AM Marek Olšák wrote: > > Is GTT cleared too? GTT has always been cleared since it's system memory. Alex > > Marek > > On Thu, Sep 26, 2024, 09:53 Alex Deucher wrote: >> >> Ping? >> >> On Fri, Sep 6, 2024 at 2:04 PM Alex Deucher >> wrote: >> > >> > Driver now clears VRAM on allocation. Bump the >> > driver version so mesa knows when it will get >> > cleared vram by default. >> > >> > Signed-off-by: Alex Deucher >> > --- >> > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- >> > 1 file changed, 2 insertions(+), 1 deletion(-) >> > >> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c >> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c >> > index d90473aec942..fad556be840b 100644 >> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c >> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c >> > @@ -117,9 +117,10 @@ >> > * - 3.56.0 - Update IB start address and size alignment for decode and >> > encode >> > * - 3.57.0 - Compute tunneling on GFX10+ >> > * - 3.58.0 - Add GFX12 DCC support >> > + * - 3.59.0 - Cleared VRAM >> > */ >> > #define KMS_DRIVER_MAJOR 3 >> > -#define KMS_DRIVER_MINOR 58 >> > +#define KMS_DRIVER_MINOR 59 >> > #define KMS_DRIVER_PATCHLEVEL 0 >> > >> > /* >> > -- >> > 2.46.0 >> >
Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed
On Thu, Sep 26, 2024 at 10:45 AM Zhang, Yifan wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > -Original Message- > From: Alex Deucher > Sent: Thursday, September 26, 2024 1:17 AM > To: Zhang, Yifan > Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ; > Kuehling, Felix > Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB > limitation removed > > On Tue, Sep 24, 2024 at 10:22 AM Zhang, Yifan wrote: > > > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > 2GB limitation in VRAM allocation is removed in below patch. My patch is a > > follow up refine for this. The remaing_size calculation was to address the > > 2GB limitation in contiguous VRAM allocation, and no longer needed after > > the limitation is removed. > > > > Thanks. Would be good to add a reference to this commit in the commit > message so it's clear what you are talking about and also provide a bit more > of a description about why this can be cleaned up (like you did above). One > other comment below as well. > > Sure, I will send patch V2 to change commit message. > > > commit b2dba064c9bdd18c7dd39066d25453af28451dbf > > Author: Philip Yang > > Date: Fri Apr 19 16:27:00 2024 -0400 > > > > drm/amdgpu: Handle sg size limit for contiguous allocation > > > > Define macro AMDGPU_MAX_SG_SEGMENT_SIZE 2GB, because struct scatterlist > > length is unsigned int, and some users of it cast to a signed int, so > > every segment of sg table is limited to size 2GB maximum. > > > > For contiguous VRAM allocation, don't limit the max buddy block size in > > order to get contiguous VRAM memory. To workaround the sg table segment > > size limit, allocate multiple segments if contiguous size is bigger than > > AMDGPU_MAX_SG_SEGMENT_SIZE. > > > > Signed-off-by: Philip Yang > > Reviewed-by: Christian König > > Signed-off-by: Alex Deucher > > > > > > > > -Original Message- > > From: Alex Deucher > > Sent: Tuesday, September 24, 2024 10:07 PM > > To: Zhang, Yifan > > Cc: amd-gfx@lists.freedesktop.org; Yang, Philip ; > > Kuehling, Felix > > Subject: Re: [PATCH] drm/amdgpu: simplify vram alloc logic since 2GB > > limitation removed > > > > On Mon, Sep 23, 2024 at 4:28 AM Yifan Zhang wrote: > > > > > > Make vram alloc loop simpler after 2GB limitation removed. > > > > Can you provide more context? What 2GB limitation are you referring to? 
> > > > Alex > > > > > > > > Signed-off-by: Yifan Zhang > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 15 +-- > > > 1 file changed, 5 insertions(+), 10 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > > index 7d26a962f811..3d129fd61fa7 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c > > > @@ -455,7 +455,7 @@ static int amdgpu_vram_mgr_new(struct > > > ttm_resource_manager *man, > > > struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); > > > u64 vis_usage = 0, max_bytes, min_block_size; > > > struct amdgpu_vram_mgr_resource *vres; > > > - u64 size, remaining_size, lpfn, fpfn; > > > + u64 size, lpfn, fpfn; > > > unsigned int adjust_dcc_size = 0; > > > struct drm_buddy *mm = &mgr->mm; > > > struct drm_buddy_block *block; @@ -516,25 +516,23 @@ static > > > int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, > > > adev->gmc.gmc_funcs->get_dcc_alignment) > > > adjust_dcc_size = > > > amdgpu_gmc_get_dcc_alignment(adev); > > > > > > - remaining_size = (u64)vres->base.size; > > > + size = (u64)vres->base.size; > > > if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && > > > adjust_dcc_size) { > > > unsigned int dcc_size; > > > > > > dcc_size = roundup_pow_of_two(vres->base.size + > > > adjust_dcc_size); > > > - remaining_size = (u64)dcc_size; > > > + size = (u64)dcc_size; > > > > > > vres->flags |= DRM_BUDDY_TRIM_DISABLE; > > > } > > > > > > mutex_lock(&mgr->lock); > > > - while (remaining_size) { > > > + while (true) { > > I think you can also drop this while loop now too. > > Alex > > There is still a "continue" path left in the loop, I think the logic may be > broken here if loop dropped. Ah, yes, sorry I missed that continue. Alex > >r = drm_buddy_alloc_blocks(mm, fpfn, > lpfn, > size, > min_block_size, > &vres->blocks, > vres->flags); > > if (unlikely(r == -ENOSPC) && pages_per_block == ~0ul && > !(place->flags & TTM_PL_FLAG_CONTIGUOUS)) { > vres->flags &= ~DRM_BUDDY_CONTIGUOUS_ALLOCATION; > pages_per_block = max_t(u32, 2
Re: [PATCH v7 07/10] drm/amd/display: get SAD from drm_eld when parsing EDID caps
Mario and Melissa, Another regression identified on this patch - DP Display is not listed as an audio device after this patch is applied. Cheers, Alex Hung On 9/18/24 15:38, Mario Limonciello wrote: From: Melissa Wen drm_edid_connector_update() updates display info, filling ELD with audio info from Short-Audio Descriptors in the last step of update_dislay_info(). Our goal is stopping using raw edid, so we can extract SAD from drm_eld instead of access raw edid to get audio caps. Signed-off-by: Melissa Wen Signed-off-by: Mario Limonciello --- .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 20 +-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index 854e51c0b3fb..e0326127fd9a 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -98,9 +98,7 @@ enum dc_edid_status dm_helpers_parse_edid_caps( const struct drm_edid *drm_edid = aconnector->drm_edid; struct drm_edid_product_id product_id; struct edid *edid_buf = edid ? (struct edid *) edid->raw_edid : NULL; - struct cea_sad *sads; - int sad_count = -1; - int sadb_count = -1; + int sad_count, sadb_count; int i = 0; uint8_t *sadb = NULL; @@ -129,18 +127,21 @@ enum dc_edid_status dm_helpers_parse_edid_caps( apply_edid_quirks(&product_id, edid_caps); - sad_count = drm_edid_to_sad((struct edid *) edid->raw_edid, &sads); + sad_count = drm_eld_sad_count(connector->eld); if (sad_count <= 0) return result; edid_caps->audio_mode_count = min(sad_count, DC_MAX_AUDIO_DESC_COUNT); for (i = 0; i < edid_caps->audio_mode_count; ++i) { - struct cea_sad *sad = &sads[i]; + struct cea_sad sad; - edid_caps->audio_modes[i].format_code = sad->format; - edid_caps->audio_modes[i].channel_count = sad->channels + 1; - edid_caps->audio_modes[i].sample_rate = sad->freq; - edid_caps->audio_modes[i].sample_size = sad->byte2; + if (drm_eld_sad_get(connector->eld, i, &sad) < 0) + continue; + + edid_caps->audio_modes[i].format_code = sad.format; + edid_caps->audio_modes[i].channel_count = sad.channels + 1; + edid_caps->audio_modes[i].sample_rate = sad.freq; + edid_caps->audio_modes[i].sample_size = sad.byte2; } sadb_count = drm_edid_to_speaker_allocation((struct edid *) edid->raw_edid, &sadb); @@ -155,7 +156,6 @@ enum dc_edid_status dm_helpers_parse_edid_caps( else edid_caps->speaker_flags = DEFAULT_SPEAKER_LOCATION; - kfree(sads); kfree(sadb); return result;
Re: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init
>>I guess you are referring to the below corner case 1) Place NPS request 2) Unload Driver 3) Reinstall driver with a different TOS (possible but quite unlikely) 4) Driver reload 5) Driver checks TOS version first and goes for a reset 6) reset_flag of GMC is not set, hence it doesn't refresh the NPS range. >>I think changing the order in soc15_need_reset_on_init() to check for NPS request before TOS version check will solve this. Yes, I was thinking of reset_flag and TOS reloading (adev->init_lvl->level set to AMDGPU_INIT_LEVEL_MINIMAL_XGMI) changing at the same time, in which case the NPS refresh would be ignored. Though it might well happen in debugging or regression-isolation cases, where the driver is packaged with a different TOS. And yes, making the NPS refresh check come before the TOS version check will solve this. And if we do not return early when checking the NPS request before the TOS version check in soc15_need_reset_on_init(), we can drop the (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) check in the refresh condition below: + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); refresh = (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); Thanks Feifei On 9/26/2024 5:01 PM, Xu, Feifei wrote: + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); Is there a corner case where reloading with a different-version TOS and a pending NPS refresh co-exist? Thanks, Feifei -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Tuesday, September 24, 2024 1:57 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking; Deucher, Alexander; Koenig, Christian; Bhardwaj, Rajneesh; Errabolu, Ramesh Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init Add a callback to check if there is any condition detected by GMC block for reset on init. One case is if a pending NPS change request is detected. If reset is done because of NPS switch, refresh NPS info from discovery table.
Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 - drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 + drivers/gpu/drm/amd/amdgpu/soc15.c | 2 ++ 3 files changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 21f1e65c9dc9..011fe3a847d0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, struct amdgpu_gmc_memrange *ranges; int range_cnt, ret, i, j; uint32_t nps_type; + bool refresh; if (!mem_ranges) return -EINVAL; + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges, - &range_cnt, false); + &range_cnt, refresh); if (ret) return ret; @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev) adev->dev, "NPS mode change request done, reload driver to complete the change\n"); } + +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) { + if (adev->gmc.gmc_funcs->need_reset_on_init) + return adev->gmc.gmc_funcs->need_reset_on_init(adev); + + return false; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index b13d6adb5efd..d4cd247fe574 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -78,6 +78,8 @@ enum amdgpu_memory_partition { BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \ BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE)) +#define AMDGPU_GMC_INIT_RESET_NPS BIT(0) + /* * GMC page fault information */ @@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs { /* Request NPS mode */ int (*request_mem_partition_mode)(struct amdgpu_device *adev, int nps_mode); + bool (*need_reset_on_init)(struct amdgpu_device *adev); }; struct amdgpu_xgmi_ras { @@ -314,6 +317,7 @@ struct amdgpu_gmc { const struct amdgpu_gmc_funcs *gmc_funcs; enum amdgpu_memory_partitionrequested_nps_mode; uint32_t supported_nps_modes; + uint32_t reset_flags; struct amdgpu_xgmi xgmi; struct amdgpu_irq_src ecc_irq; @@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, int amdgpu_gmc_request_memory_partition(struct amdg
[PATCH v3] docs/gpu: ci: update flake tests requirements
Update the documentation to specify linking to a relevant GitLab issue or email report for each new flake entry. Added specific GitLab issue URLs for i915, msm and amdgpu drivers. Acked-by: Abhinav Kumar # msm Acked-by: Dmitry Baryshkov # msm Signed-off-by: Vignesh Raman --- v2: - Add gitlab issue link for msm driver. v3: - Update docs to specify we use email reporting or GitLab issues for flake entries. --- Documentation/gpu/automated_testing.rst | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst index 2d5a28866afe..03769b4a17cf 100644 --- a/Documentation/gpu/automated_testing.rst +++ b/Documentation/gpu/automated_testing.rst @@ -68,19 +68,24 @@ known to behave unreliably. These tests won't cause a job to fail regardless of the result. They will still be run. Each new flake entry must be associated with a link to the email reporting the -bug to the author of the affected driver, the board name or Device Tree name of -the board, the first kernel version affected, the IGT version used for tests, -and an approximation of the failure rate. +bug to the author of the affected driver or the relevant GitLab issue. The entry +must also include the board name or Device Tree name, the first kernel version +affected, the IGT version used for tests, and an approximation of the failure rate. They should be provided under the following format:: - # Bug Report: $LORE_OR_PATCHWORK_URL + # Bug Report: $LORE_URL_OR_GITLAB_ISSUE # Board Name: broken-board.dtb # Linux Version: 6.6-rc1 # IGT Version: 1.28-gd2af13d9f # Failure Rate: 100 flaky-test +Use the appropriate link below to create a GitLab issue: +amdgpu driver: https://gitlab.freedesktop.org/drm/amd/-/issues +i915 driver: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues +msm driver: https://gitlab.freedesktop.org/drm/msm/-/issues + drivers/gpu/drm/ci/${DRIVER_NAME}-${HW_REVISION}-skips.txt --- -- 2.43.0
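For illustration, a filled-in entry following this format could look like the lines below; the issue number, board name and test name are invented for the example.

    # Bug Report: https://gitlab.freedesktop.org/drm/amd/-/issues/1234
    # Board Name: hp-11A-G6-EE-grunt
    # Linux Version: 6.11-rc1
    # IGT Version: 1.28-gd2af13d9f
    # Failure Rate: 20
    amdgpu/amd_basic@semaphore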
Re: [PATCH v2] docs/gpu: ci: update flake tests requirements
On 9/26/2024 12:06 AM, Vignesh Raman wrote: Update the documentation to require linking to a relevant GitLab issue for each new flake entry instead of an email report. Added specific GitLab issue URLs for i915, xe and other drivers. Signed-off-by: Vignesh Raman --- v2: - Add gitlab issue link for msm driver. --- Documentation/gpu/automated_testing.rst | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/Documentation/gpu/automated_testing.rst b/Documentation/gpu/automated_testing.rst index 2d5a28866afe..f918fe56f2b0 100644 --- a/Documentation/gpu/automated_testing.rst +++ b/Documentation/gpu/automated_testing.rst @@ -67,20 +67,26 @@ Lists the tests that for a given driver on a specific hardware revision are known to behave unreliably. These tests won't cause a job to fail regardless of the result. They will still be run. -Each new flake entry must be associated with a link to the email reporting the -bug to the author of the affected driver, the board name or Device Tree name of -the board, the first kernel version affected, the IGT version used for tests, -and an approximation of the failure rate. +Each new flake entry must include a link to the relevant GitLab issue, the board +name or Device Tree name, the first kernel version affected, the IGT version used +for tests and an approximation of the failure rate. They should be provided under the following format:: - # Bug Report: $LORE_OR_PATCHWORK_URL + # Bug Report: $GITLAB_ISSUE # Board Name: broken-board.dtb # Linux Version: 6.6-rc1 # IGT Version: 1.28-gd2af13d9f # Failure Rate: 100 flaky-test +The GitLab issue must include the logs and the pipeline link. Use the appropriate +link below to create an issue. +https://gitlab.freedesktop.org/drm/i915/kernel/-/issues for i915 driver +https://gitlab.freedesktop.org/drm/xe/kernel/-/issues for xe driver +https://gitlab.freedesktop.org/drm/msm/-/issues for msm driver +https://gitlab.freedesktop.org/drm/misc/kernel/-/issues for other drivers + Acked for MSM. Thanks Abhinav drivers/gpu/drm/ci/${DRIVER_NAME}-${HW_REVISION}-skips.txt ---
Re: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init
On 9/27/2024 9:26 AM, Xu, Feifei wrote: >>>I guess you are referring to the below corner case > >>> 1) Place NPS request > >>> 2) Unload Driver > >>> 3) Reinstall driver with a different TOS (possible but quite > unlikely) > >>> 4) Driver reload > >>> 5) Driver checks TOS version first and goes for a reset > >>> 6) reset_flag of GMC is not set, hence it doesn't refresh > the NPS range. > > > > > >>>I think changing the order in soc15_need_reset_on_init() to check for > NPS request before TOS version check will solve this. > > Yes, I was thinking of reset_flag and TOS reloading > (adev->init_lvl->level set to AMDGPU_INIT_LEVEL_MINIMAL_XGMI) changing > at the same time. And NPS refresh will be ignored. Though it might be > likely in debugging or regression isolation cases where the driver is > changed and packaged with a different TOS. And yes, making the NPS refresh check before > the TOS version check will solve this. > > And if we do not return ahead when checking the NPS request before the TOS > version change in soc15_need_reset_on_init(), we can drop the > (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) check in the below > refresh check: > This check is used when an NPS request change is identified. During the sw init part it will be at MINIMAL_XGMI level, but at that point there is no need to refresh this information as a reset is pending. It is refreshed after a reset when the init level returns to default. Thanks, Lijo > + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && > + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); > > refresh = (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); > > Thanks > Feifei > > On 9/26/2024 5:01 PM, Xu, Feifei wrote: + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); >> Is there a corner case that reloading with a different version TOS and >> refreshing the NPS change co-exist? >> >> Thanks, >> Feifei >> >> -----Original Message----- >> From: amd-gfx On Behalf Of Lijo Lazar >> Sent: Tuesday, September 24, 2024 1:57 PM >> To: amd-gfx@lists.freedesktop.org >> Cc: Zhang, Hawking ; Deucher, Alexander >> ; Koenig, Christian ; >> Bhardwaj, Rajneesh ; Errabolu, Ramesh >> >> Subject: [PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init >> >> Add a callback to check if there is any condition detected by GMC block for >> reset on init. One case is if a pending NPS change request is detected. If >> reset is done because of NPS switch, refresh NPS info from discovery table.
>> >> Signed-off-by: Lijo Lazar >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 - >> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 + >> drivers/gpu/drm/amd/amdgpu/soc15.c | 2 ++ >> 3 files changed, 19 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c >> index 21f1e65c9dc9..011fe3a847d0 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c >> @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct >> amdgpu_device *adev, >> struct amdgpu_gmc_memrange *ranges; >> int range_cnt, ret, i, j; >> uint32_t nps_type; >> + bool refresh; >> >> if (!mem_ranges) >> return -EINVAL; >> >> + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) >> && >> + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); >> ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges, >> - &range_cnt, false); >> + &range_cnt, refresh); >> >> if (ret) >> return ret; >> @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct >> amdgpu_device *adev) >> adev->dev, >> "NPS mode change request done, reload driver to >> complete the change\n"); } >> + >> +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) { >> + if (adev->gmc.gmc_funcs->need_reset_on_init) >> + return adev->gmc.gmc_funcs->need_reset_on_init(adev); >> + >> + return false; >> +} >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >> index b13d6adb5efd..d4cd247fe574 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h >> @@ -78,6 +78,8 @@ enum amdgpu_memory_partition { >> BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | >> \ >> BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE)) >> >> +#define AMDGPU_GMC_INIT_RESET_NPS BIT(0) >> + >> /* >> * GMC page fault information >> */ >> @@ -169,6 +171,7 @@ struct amdgpu_gmc_fun
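For readers following this subthread: the v2 ordering agreed on here checks the GMC condition before the TOS check, so a pending NPS switch both triggers the reset and sets the NPS reset flag even when a TOS reload would independently have asked for a reset. A minimal sketch of the resulting soc15_need_reset_on_init() flow, trimmed to only the two checks discussed here (the real function carries additional conditions):

static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
{
        /* GMC/NPS check first: the GC 9.4.3 callback behind
         * amdgpu_gmc_need_reset_on_init() is what sets
         * AMDGPU_GMC_INIT_RESET_NPS, so checking it before the TOS check
         * guarantees the NPS ranges are refreshed after the reset even
         * when a TOS reload also wants a reset.
         */
        if (amdgpu_gmc_need_reset_on_init(adev))
                return true;

        if (amdgpu_psp_tos_reload_needed(adev))
                return true;

        return false;
}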
[PATCH v2 0/7] Add support for dynamic NPS switch
This series adds support for dynamic NPS switch on GC v9.4.3/9.4.4 SOC variants. In order to do a dynamic NPS switch, a sysfs interface is provided to request a new NPS mode. If the device is part of a hive, all hive devices are required to be in the same NPS mode. Hence a hive device request is saved in a hive variable. For an individual device, it's saved in a gmc block variable.

In order to do an NPS mode switch, the workflow is -

1) User places a request through the sysfs node.
2) User unloads the driver.
3) During unload, driver checks for any pending NPS switch request. If any request is pending, it places the request to PSP FW.
4) For a hive, the request is placed in one go for all devices in the hive. If one of the requests fails, a request is placed again to revert to the current NPS mode on the successful devices.
5) User reloads the driver.
6) On reload, driver checks if an NPS switch is pending and initiates a mode-1 reset.
7) During resume after a reset, NPS ranges are read again from the discovery table.
8) Driver detects the new NPS mode and makes a compatible compute partition mode switch if required.

v2: Move NPS request check ahead of TOS reload requirement check (Feifei)

Lijo Lazar (7):
  drm/amdgpu: Add option to refresh NPS data
  drm/amdgpu: Add PSP interface for NPS switch
  drm/amdgpu: Add gmc interface to request NPS mode
  drm/amdgpu: Add sysfs interfaces for NPS mode
  drm/amdgpu: Place NPS mode request on unload
  drm/amdgpu: Check gmc requirement for reset on init
  drm/amdgpu: Add NPS switch support for GC 9.4.3

 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 190 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 19 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 39
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 5 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 44
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c| 12 ++
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 14 +-
 drivers/gpu/drm/amd/amdgpu/soc15.c| 2 +
 14 files changed, 387 insertions(+), 36 deletions(-)

-- 2.25.1
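As a pointer for readers of the workflow above: steps 3 and 4 are implemented by patch 5 of this series (amdgpu_gmc_prepare_nps_mode_change(), called from amdgpu_pci_remove()). Condensed to its single-device decision logic, the unload-time check is roughly the sketch below; the full hive handling and error reporting are in patch 5.

/* Condensed sketch of the unload-time check (patch 5): forward the request
 * to PSP FW only when a supported mode different from the current one was
 * cached earlier through the sysfs node.
 */
req_nps_mode = adev->gmc.requested_nps_mode;
cur_nps_mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev);

if ((BIT(req_nps_mode) & adev->gmc.supported_nps_modes) &&
    req_nps_mode != cur_nps_mode)
        adev->gmc.gmc_funcs->request_mem_partition_mode(adev, req_nps_mode);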
[PATCH v2 2/7] drm/amdgpu: Add PSP interface for NPS switch
Implement PSP ring command interface for memory partitioning on the fly on the supported asics. Signed-off-by: Rajneesh Bhardwaj Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 + drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 14 +++--- 3 files changed, 37 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 944dad9ad29f..04be0fabb4f5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -1045,6 +1045,31 @@ static int psp_rl_load(struct amdgpu_device *adev) return ret; } +int psp_memory_partition(struct psp_context *psp, int mode) +{ + struct psp_gfx_cmd_resp *cmd; + int ret; + + if (amdgpu_sriov_vf(psp->adev)) + return 0; + + cmd = acquire_psp_cmd_buf(psp); + + cmd->cmd_id = GFX_CMD_ID_FB_NPS_MODE; + cmd->cmd.cmd_memory_part.mode = mode; + + dev_info(psp->adev->dev, +"Requesting %d memory partition change through PSP", mode); + ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr); + if (ret) + dev_err(psp->adev->dev, + "PSP request failed to change to NPS%d mode\n", mode); + + release_psp_cmd_buf(psp); + + return ret; +} + int psp_spatial_partition(struct psp_context *psp, int mode) { struct psp_gfx_cmd_resp *cmd; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h index 76fa18ffc045..567cb1f924ca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h @@ -553,6 +553,7 @@ int psp_load_fw_list(struct psp_context *psp, void psp_copy_fw(struct psp_context *psp, uint8_t *start_addr, uint32_t bin_size); int psp_spatial_partition(struct psp_context *psp, int mode); +int psp_memory_partition(struct psp_context *psp, int mode); int is_psp_fw_valid(struct psp_bin_desc bin); diff --git a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h index 604301371e4f..f4a91b126c73 100644 --- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h +++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h @@ -103,8 +103,10 @@ enum psp_gfx_cmd_id GFX_CMD_ID_AUTOLOAD_RLC = 0x0021, /* Indicates all graphics fw loaded, start RLC autoload */ GFX_CMD_ID_BOOT_CFG = 0x0022, /* Boot Config */ GFX_CMD_ID_SRIOV_SPATIAL_PART = 0x0027, /* Configure spatial partitioning mode */ - /*IDs of performance monitoring/profiling*/ - GFX_CMD_ID_CONFIG_SQ_PERFMON = 0x0046, /* Config CGTT_SQ_CLK_CTRL */ +/*IDs of performance monitoring/profiling*/ +GFX_CMD_ID_CONFIG_SQ_PERFMON = 0x0046, /* Config CGTT_SQ_CLK_CTRL */ +/* Dynamic memory partitioninig (NPS mode change)*/ +GFX_CMD_ID_FB_NPS_MODE = 0x0048, /* Configure memory partitioning mode */ }; /* PSP boot config sub-commands */ @@ -362,6 +364,11 @@ struct psp_gfx_cmd_config_sq_perfmon { uint8_t reserved[5]; }; +struct psp_gfx_cmd_fb_memory_part { + uint32_t mode; /* requested NPS mode */ + uint32_t resvd; +}; + /* All GFX ring buffer commands. */ union psp_gfx_commands { @@ -376,7 +383,8 @@ union psp_gfx_commands struct psp_gfx_cmd_load_toc cmd_load_toc; struct psp_gfx_cmd_boot_cfg boot_cfg; struct psp_gfx_cmd_sriov_spatial_part cmd_spatial_part; - struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon; +struct psp_gfx_cmd_config_sq_perfmon config_sq_perfmon; +struct psp_gfx_cmd_fb_memory_part cmd_memory_part; }; struct psp_gfx_uresp_reserved -- 2.25.1
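For orientation within the series: callers are not expected to submit GFX_CMD_ID_FB_NPS_MODE themselves; patch 3 fronts this function with a GMC helper that performs the environment checks first. A condensed view of that wrapper (the full version, including its error message, is in patch 3):

int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev,
                                        int nps_mode)
{
        /* NPS mode change is not supported on VF devices and APUs */
        if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU))
                return -EOPNOTSUPP;

        /* PSP must be present to carry the request to firmware */
        if (!adev->psp.funcs)
                return -EINVAL;

        return psp_memory_partition(&adev->psp, nps_mode);
}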
[PATCH v2 1/7] drm/amdgpu: Add option to refresh NPS data
In certain use cases, NPS data needs to be refreshed again from discovery table. Add API parameter to refresh NPS data from discovery table. Signed-off-by: Lijo Lazar Reviewed-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +- 3 files changed, 55 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 4bd61c169ca8..9f9a1867da72 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -1723,37 +1723,75 @@ union nps_info { struct nps_info_v1_0 v1; }; +static int amdgpu_discovery_refresh_nps_info(struct amdgpu_device *adev, +union nps_info *nps_data) +{ + uint64_t vram_size, pos, offset; + struct nps_info_header *nhdr; + struct binary_header bhdr; + uint16_t checksum; + + vram_size = (uint64_t)RREG32(mmRCC_CONFIG_MEMSIZE) << 20; + pos = vram_size - DISCOVERY_TMR_OFFSET; + amdgpu_device_vram_access(adev, pos, &bhdr, sizeof(bhdr), false); + + offset = le16_to_cpu(bhdr.table_list[NPS_INFO].offset); + checksum = le16_to_cpu(bhdr.table_list[NPS_INFO].checksum); + + amdgpu_device_vram_access(adev, (pos + offset), nps_data, + sizeof(*nps_data), false); + + nhdr = (struct nps_info_header *)(nps_data); + if (!amdgpu_discovery_verify_checksum((uint8_t *)nps_data, + le32_to_cpu(nhdr->size_bytes), + checksum)) { + dev_err(adev->dev, "nps data refresh, checksum mismatch\n"); + return -EINVAL; + } + + return 0; +} + int amdgpu_discovery_get_nps_info(struct amdgpu_device *adev, uint32_t *nps_type, struct amdgpu_gmc_memrange **ranges, - int *range_cnt) + int *range_cnt, bool refresh) { struct amdgpu_gmc_memrange *mem_ranges; struct binary_header *bhdr; union nps_info *nps_info; + union nps_info nps_data; u16 offset; - int i; + int i, r; if (!nps_type || !range_cnt || !ranges) return -EINVAL; - if (!adev->mman.discovery_bin) { - dev_err(adev->dev, - "fetch mem range failed, ip discovery uninitialized\n"); - return -EINVAL; - } + if (refresh) { + r = amdgpu_discovery_refresh_nps_info(adev, &nps_data); + if (r) + return r; + nps_info = &nps_data; + } else { + if (!adev->mman.discovery_bin) { + dev_err(adev->dev, + "fetch mem range failed, ip discovery uninitialized\n"); + return -EINVAL; + } - bhdr = (struct binary_header *)adev->mman.discovery_bin; - offset = le16_to_cpu(bhdr->table_list[NPS_INFO].offset); + bhdr = (struct binary_header *)adev->mman.discovery_bin; + offset = le16_to_cpu(bhdr->table_list[NPS_INFO].offset); - if (!offset) - return -ENOENT; + if (!offset) + return -ENOENT; - /* If verification fails, return as if NPS table doesn't exist */ - if (amdgpu_discovery_verify_npsinfo(adev, bhdr)) - return -ENOENT; + /* If verification fails, return as if NPS table doesn't exist */ + if (amdgpu_discovery_verify_npsinfo(adev, bhdr)) + return -ENOENT; - nps_info = (union nps_info *)(adev->mman.discovery_bin + offset); + nps_info = + (union nps_info *)(adev->mman.discovery_bin + offset); + } switch (le16_to_cpu(nps_info->v1.header.version_major)) { case 1: diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h index f5d36525ec3e..b44d56465c5b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h @@ -33,6 +33,6 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev); int amdgpu_discovery_get_nps_info(struct amdgpu_device 
*adev, uint32_t *nps_type, struct amdgpu_gmc_memrange **ranges, - int *range_cnt); + int *range_cnt, bool refresh); #endif /* __AMDGPU_DISCOVERY__ */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 17a19d49d30a..4f088a5368d8 100644 -
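Note that nothing in this patch passes refresh=true yet; the existing caller keeps passing false. Patch 6 of this series is what flips it, deriving the flag from the GMC reset flags once a reset has been done for an NPS switch. The call site then becomes (taken from patch 6):

        refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) &&
                  (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS);
        ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges,
                                            &range_cnt, refresh);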
[PATCH v2 5/7] drm/amdgpu: Place NPS mode request on unload
If a user has requested NPS mode switch, place the request through PSP during unload of the driver. For devices which are part of a hive, all requests are placed together. If one of them fails, revert back to the current NPS mode. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 47 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 38 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 4 ++ 5 files changed, 92 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 95331294509c..d16cdcdb2114 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2428,6 +2428,7 @@ amdgpu_pci_remove(struct pci_dev *pdev) struct amdgpu_device *adev = drm_to_adev(dev); amdgpu_xcp_dev_unplug(adev); + amdgpu_gmc_prepare_nps_mode_change(adev); drm_dev_unplug(dev); if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 24a1f931d9ed..21f1e65c9dc9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1345,3 +1345,50 @@ int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, return psp_memory_partition(&adev->psp, nps_mode); } + +static inline bool amdgpu_gmc_need_nps_switch_req(struct amdgpu_device *adev, + int req_nps_mode, + int cur_nps_mode) +{ + return (((BIT(req_nps_mode) & adev->gmc.supported_nps_modes) == + BIT(req_nps_mode)) && + req_nps_mode != cur_nps_mode); +} + +void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev) +{ + int req_nps_mode, cur_nps_mode, r; + struct amdgpu_hive_info *hive; + + if (amdgpu_sriov_vf(adev) || !adev->gmc.supported_nps_modes || + !adev->gmc.gmc_funcs->request_mem_partition_mode) + return; + + cur_nps_mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); + hive = amdgpu_get_xgmi_hive(adev); + if (hive) { + req_nps_mode = atomic_read(&hive->requested_nps_mode); + if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode, + cur_nps_mode)) { + amdgpu_put_xgmi_hive(hive); + return; + } + r = amdgpu_xgmi_request_nps_change(adev, hive, req_nps_mode); + amdgpu_put_xgmi_hive(hive); + goto out; + } + + req_nps_mode = adev->gmc.requested_nps_mode; + if (!amdgpu_gmc_need_nps_switch_req(adev, req_nps_mode, cur_nps_mode)) + return; + + /* even if this fails, we should let driver unload w/o blocking */ + r = adev->gmc.gmc_funcs->request_mem_partition_mode(adev, req_nps_mode); +out: + if (r) + dev_err(adev->dev, "NPS mode change request failed\n"); + else + dev_info( + adev->dev, + "NPS mode change request done, reload driver to complete the change\n"); +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index 1a50639a003a..b13d6adb5efd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -467,4 +467,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, int nps_mode); +void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev); + #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index 5d721ccb9dfd..db2c1b11b813 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -1564,3 +1564,41 @@ 
int amdgpu_xgmi_reset_on_init(struct amdgpu_device *adev) return 0; } + +int amdgpu_xgmi_request_nps_change(struct amdgpu_device *adev, + struct amdgpu_hive_info *hive, + int req_nps_mode) +{ + struct amdgpu_device *tmp_adev; + int cur_nps_mode, r; + + /* This is expected to be called only during unload of driver. The +* request needs to be placed only once for all devices in the hive. If +* one of them fail, revert the request for previous successful devices. +* After placing the request, make hive mode as UNKNOWN so that other +* devices don't request anymore. +*/ + mutex_lock(&hive->hive_lock); + list_for_each_entry(tmp_adev
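The hive branch of amdgpu_xgmi_request_nps_change() is cut off above. The strategy it implements is the one stated in the commit message: place the request on every device in the hive and, if any of them fails, ask the devices that already accepted it to fall back to the current mode. A rough sketch of that shape is below; it is illustrative only, and the hive list walk (hive->device_list, gmc.xgmi.head) follows the conventions used elsewhere in amdgpu_xgmi.c rather than being copied from the truncated text.

        int r = 0;

        mutex_lock(&hive->hive_lock);
        list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
                r = tmp_adev->gmc.gmc_funcs->request_mem_partition_mode(
                        tmp_adev, req_nps_mode);
                if (r)
                        break;
        }
        if (r) {
                /* Revert the devices that already accepted the new mode */
                list_for_each_entry_continue_reverse(tmp_adev,
                                                     &hive->device_list,
                                                     gmc.xgmi.head)
                        tmp_adev->gmc.gmc_funcs->request_mem_partition_mode(
                                tmp_adev, cur_nps_mode);
        }
        /* Clear the hive request so other devices don't place it again */
        atomic_set(&hive->requested_nps_mode, UNKNOWN_MEMORY_PARTITION_MODE);
        mutex_unlock(&hive->hive_lock);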
[PATCH v2 6/7] drm/amdgpu: Check gmc requirement for reset on init
Add a callback to check if there is any condition detected by GMC block for reset on init. One case is if a pending NPS change request is detected. If reset is done because of NPS switch, refresh NPS info from discovery table. Signed-off-by: Lijo Lazar --- v2: Move NPS request check ahead of TOS reload requirement check (Feifei) drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 - drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 5 + drivers/gpu/drm/amd/amdgpu/soc15.c | 2 ++ 3 files changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 21f1e65c9dc9..011fe3a847d0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1261,12 +1261,15 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, struct amdgpu_gmc_memrange *ranges; int range_cnt, ret, i, j; uint32_t nps_type; + bool refresh; if (!mem_ranges) return -EINVAL; + refresh = (adev->init_lvl->level != AMDGPU_INIT_LEVEL_MINIMAL_XGMI) && + (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS); ret = amdgpu_discovery_get_nps_info(adev, &nps_type, &ranges, - &range_cnt, false); + &range_cnt, refresh); if (ret) return ret; @@ -1392,3 +1395,11 @@ void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev) adev->dev, "NPS mode change request done, reload driver to complete the change\n"); } + +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev) +{ + if (adev->gmc.gmc_funcs->need_reset_on_init) + return adev->gmc.gmc_funcs->need_reset_on_init(adev); + + return false; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index b13d6adb5efd..d4cd247fe574 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -78,6 +78,8 @@ enum amdgpu_memory_partition { BIT(AMDGPU_NPS3_PARTITION_MODE) | BIT(AMDGPU_NPS4_PARTITION_MODE) | \ BIT(AMDGPU_NPS6_PARTITION_MODE) | BIT(AMDGPU_NPS8_PARTITION_MODE)) +#define AMDGPU_GMC_INIT_RESET_NPS BIT(0) + /* * GMC page fault information */ @@ -169,6 +171,7 @@ struct amdgpu_gmc_funcs { /* Request NPS mode */ int (*request_mem_partition_mode)(struct amdgpu_device *adev, int nps_mode); + bool (*need_reset_on_init)(struct amdgpu_device *adev); }; struct amdgpu_xgmi_ras { @@ -314,6 +317,7 @@ struct amdgpu_gmc { const struct amdgpu_gmc_funcs *gmc_funcs; enum amdgpu_memory_partitionrequested_nps_mode; uint32_t supported_nps_modes; + uint32_t reset_flags; struct amdgpu_xgmi xgmi; struct amdgpu_irq_src ecc_irq; @@ -468,5 +472,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, int nps_mode); void amdgpu_gmc_prepare_nps_mode_change(struct amdgpu_device *adev); +bool amdgpu_gmc_need_reset_on_init(struct amdgpu_device *adev); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index 619933f252aa..6c0913f0d296 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -831,6 +831,8 @@ static bool soc15_need_reset_on_init(struct amdgpu_device *adev) if (adev->asic_type == CHIP_RENOIR) return true; + if (amdgpu_gmc_need_reset_on_init(adev)) + return true; if (amdgpu_psp_tos_reload_needed(adev)) return true; /* Just return false for soc15 GPUs. Reset does not seem to -- 2.25.1
[PATCH v2 4/7] drm/amdgpu: Add sysfs interfaces for NPS mode
Add a sysfs interface to see available NPS modes to switch to - cat /sys/bus/pci/devices/../available_memory_paritition Make the current_memory_partition sysfs node read/write for requesting a new NPS mode. The request is only cached and at a later point a driver unload/reload is required to switch to the new NPS mode. Ex: echo NPS1 > /sys/bus/pci/devices/../current_memory_paritition echo NPS4 > /sys/bus/pci/devices/../current_memory_paritition The above interfaces will be available only if the SOC supports more than one NPS mode. Also modify the current memory partition sysfs logic to be more generic. Signed-off-by: Lijo Lazar Reviewed-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 114 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++ 2 files changed, 104 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 758fda4e628f..24a1f931d9ed 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1130,6 +1130,79 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev) return ret; } +static const char *nps_desc[] = { + [AMDGPU_NPS1_PARTITION_MODE] = "NPS1", + [AMDGPU_NPS2_PARTITION_MODE] = "NPS2", + [AMDGPU_NPS3_PARTITION_MODE] = "NPS3", + [AMDGPU_NPS4_PARTITION_MODE] = "NPS4", + [AMDGPU_NPS6_PARTITION_MODE] = "NPS6", + [AMDGPU_NPS8_PARTITION_MODE] = "NPS8", +}; + +static ssize_t available_memory_partition_show(struct device *dev, + struct device_attribute *addr, + char *buf) +{ + struct drm_device *ddev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(ddev); + int size = 0, mode; + char *sep = ""; + + for_each_inst(mode, adev->gmc.supported_nps_modes) { + size += sysfs_emit_at(buf, size, "%s%s", sep, nps_desc[mode]); + sep = ", "; + } + size += sysfs_emit_at(buf, size, "\n"); + + return size; +} + +static ssize_t current_memory_partition_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct drm_device *ddev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(ddev); + enum amdgpu_memory_partition mode; + struct amdgpu_hive_info *hive; + int i; + + mode = UNKNOWN_MEMORY_PARTITION_MODE; + for_each_inst(i, adev->gmc.supported_nps_modes) { + if (!strncasecmp(nps_desc[i], buf, strlen(nps_desc[i]))) { + mode = i; + break; + } + } + + if (mode == UNKNOWN_MEMORY_PARTITION_MODE) + return -EINVAL; + + if (mode == adev->gmc.gmc_funcs->query_mem_partition_mode(adev)) { + dev_info( + adev->dev, + "requested NPS mode is same as current NPS mode, skipping\n"); + return count; + } + + /* If device is part of hive, all devices in the hive should request the +* same mode. Hence store the requested mode in hive. 
+*/ + hive = amdgpu_get_xgmi_hive(adev); + if (hive) { + atomic_set(&hive->requested_nps_mode, mode); + amdgpu_put_xgmi_hive(hive); + } else { + adev->gmc.requested_nps_mode = mode; + } + + dev_info( + adev->dev, + "NPS mode change requested, please remove and reload the driver\n"); + + return count; +} + static ssize_t current_memory_partition_show( struct device *dev, struct device_attribute *addr, char *buf) { @@ -1138,38 +1211,47 @@ static ssize_t current_memory_partition_show( enum amdgpu_memory_partition mode; mode = adev->gmc.gmc_funcs->query_mem_partition_mode(adev); - switch (mode) { - case AMDGPU_NPS1_PARTITION_MODE: - return sysfs_emit(buf, "NPS1\n"); - case AMDGPU_NPS2_PARTITION_MODE: - return sysfs_emit(buf, "NPS2\n"); - case AMDGPU_NPS3_PARTITION_MODE: - return sysfs_emit(buf, "NPS3\n"); - case AMDGPU_NPS4_PARTITION_MODE: - return sysfs_emit(buf, "NPS4\n"); - case AMDGPU_NPS6_PARTITION_MODE: - return sysfs_emit(buf, "NPS6\n"); - case AMDGPU_NPS8_PARTITION_MODE: - return sysfs_emit(buf, "NPS8\n"); - default: + if ((mode > ARRAY_SIZE(nps_desc)) || + (BIT(mode) & AMDGPU_ALL_NPS_MASK) != BIT(mode)) return sysfs_emit(buf, "UNKNOWN\n"); - } + + return sysfs_emit(buf, "%s\n", nps_desc[mode]); } -static DEVICE_ATTR_RO(current_memory_partition); +static DEVICE_ATTR_RW(current_memory_par
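A minimal userspace sketch of driving these attributes is below. The PCI address is a made-up placeholder (the real path depends on the device), and the attribute names are the ones defined via DEVICE_ATTR in this patch (available_memory_partition, current_memory_partition).

/* Userspace sketch: list the supported NPS modes, then request NPS4.
 * "0000:03:00.0" is a placeholder PCI address, not taken from the patch.
 * The new mode only takes effect after the driver is unloaded and reloaded.
 */
#include <stdio.h>

int main(void)
{
        const char *dev = "/sys/bus/pci/devices/0000:03:00.0";
        char path[256], modes[128];
        FILE *f;

        snprintf(path, sizeof(path), "%s/available_memory_partition", dev);
        f = fopen(path, "r");
        if (!f)
                return 1;
        if (fgets(modes, sizeof(modes), f))
                printf("supported NPS modes: %s", modes);
        fclose(f);

        snprintf(path, sizeof(path), "%s/current_memory_partition", dev);
        f = fopen(path, "w");
        if (!f)
                return 1;
        fputs("NPS4", f);       /* cached by the driver until reload */
        fclose(f);

        return 0;
}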
[PATCH v2 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC v9.4.4 currently support this. NPS switch is only supported if an SOC supports multiple NPS modes. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 + drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 44 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 12 +++ 3 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h index f61d117b0caf..79c2f807b9fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h @@ -101,6 +101,7 @@ struct amdgpu_nbio_funcs { int (*get_compute_partition_mode)(struct amdgpu_device *adev); u32 (*get_memory_partition_mode)(struct amdgpu_device *adev, u32 *supp_modes); + bool (*is_nps_switch_requested)(struct amdgpu_device *adev); u64 (*get_pcie_replay_count)(struct amdgpu_device *adev); void (*set_reg_remap)(struct amdgpu_device *adev); }; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index cafcb24449df..6a95402985ef 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1395,6 +1395,17 @@ gmc_v9_0_query_memory_partition(struct amdgpu_device *adev) return gmc_v9_0_get_memory_partition(adev, NULL); } +static bool gmc_v9_0_need_reset_on_init(struct amdgpu_device *adev) +{ + if (adev->nbio.funcs && + adev->nbio.funcs->is_nps_switch_requested(adev)) { + adev->gmc.reset_flags |= AMDGPU_GMC_INIT_RESET_NPS; + return true; + } + + return false; +} + static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = { .flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb, .flush_gpu_tlb_pasid = gmc_v9_0_flush_gpu_tlb_pasid, @@ -1406,6 +1417,8 @@ static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = { .override_vm_pte_flags = gmc_v9_0_override_vm_pte_flags, .get_vbios_fb_size = gmc_v9_0_get_vbios_fb_size, .query_mem_partition_mode = &gmc_v9_0_query_memory_partition, + .request_mem_partition_mode = &amdgpu_gmc_request_memory_partition, + .need_reset_on_init = &gmc_v9_0_need_reset_on_init, }; static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev) @@ -1545,6 +1558,28 @@ static void gmc_v9_0_set_xgmi_ras_funcs(struct amdgpu_device *adev) adev->gmc.xgmi.ras = &xgmi_ras; } +static void gmc_v9_0_init_nps_details(struct amdgpu_device *adev) +{ + adev->gmc.supported_nps_modes = 0; + + if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU)) + return; + + /*TODO: Check PSP version also which supports NPS switch. Otherwise keep +* supported modes as 0. +*/ + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { + case IP_VERSION(9, 4, 3): + case IP_VERSION(9, 4, 4): + adev->gmc.supported_nps_modes = + BIT(AMDGPU_NPS1_PARTITION_MODE) | + BIT(AMDGPU_NPS4_PARTITION_MODE); + break; + default: + break; + } +} + static int gmc_v9_0_early_init(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; @@ -2165,6 +2200,7 @@ static int gmc_v9_0_sw_init(void *handle) if (r) return r; + gmc_v9_0_init_nps_details(adev); /* * number of VMs * VMID 0 is reserved for System @@ -2440,6 +2476,14 @@ static int gmc_v9_0_resume(void *handle) int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + /* If a reset is done for NPS mode switch, read the memory range +* information again. 
+*/ + if (adev->gmc.reset_flags & AMDGPU_GMC_INIT_RESET_NPS) { + gmc_v9_0_init_sw_mem_ranges(adev, adev->gmc.mem_partitions); + adev->gmc.reset_flags &= ~AMDGPU_GMC_INIT_RESET_NPS; + } + r = gmc_v9_0_hw_init(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c index d1bd79bbae53..8a0a63ac88d2 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c @@ -401,6 +401,17 @@ static int nbio_v7_9_get_compute_partition_mode(struct amdgpu_device *adev) return px; } +static bool nbio_v7_9_is_nps_switch_requested(struct amdgpu_device *adev) +{ + u32 tmp; + + tmp = RREG32_SOC15(NBIO, 0, regBIF_BX_PF0_PARTITION_MEM_STATUS); + tmp = REG_GET_FIELD(tmp, BIF_BX_PF0_PARTITION_MEM_STATUS, + CHANGE_STATUE); + + /* 0x8 - NPS switch requested */ + return (tmp == 0x8); +} static u32 nbio_v7_9_get_memory_partition_mode(struct amdgpu_device *adev
[PATCH v2 3/7] drm/amdgpu: Add gmc interface to request NPS mode
Add a common interface in GMC to request NPS mode through PSP. Also add a variable in hive and gmc control to track the last requested mode. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: Lijo Lazar Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 1 + 4 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 4f088a5368d8..758fda4e628f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -1247,3 +1247,19 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, return ret; } + +int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, + int nps_mode) +{ + /* Not supported on VF devices and APUs */ + if (amdgpu_sriov_vf(adev) || (adev->flags & AMD_IS_APU)) + return -EOPNOTSUPP; + + if (!adev->psp.funcs) { + dev_err(adev->dev, + "PSP interface not available for nps mode change request"); + return -EINVAL; + } + + return psp_memory_partition(&adev->psp, nps_mode); +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index 33b2adffd58b..f5be5112b742 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -161,6 +161,9 @@ struct amdgpu_gmc_funcs { enum amdgpu_memory_partition (*query_mem_partition_mode)( struct amdgpu_device *adev); + /* Request NPS mode */ + int (*request_mem_partition_mode)(struct amdgpu_device *adev, + int nps_mode); }; struct amdgpu_xgmi_ras { @@ -304,6 +307,7 @@ struct amdgpu_gmc { struct amdgpu_mem_partition_info *mem_partitions; uint8_t num_mem_partitions; const struct amdgpu_gmc_funcs *gmc_funcs; + enum amdgpu_memory_partitionrequested_nps_mode; struct amdgpu_xgmi xgmi; struct amdgpu_irq_src ecc_irq; @@ -455,4 +459,6 @@ int amdgpu_gmc_get_nps_memranges(struct amdgpu_device *adev, struct amdgpu_mem_partition_info *mem_ranges, int exp_ranges); +int amdgpu_gmc_request_memory_partition(struct amdgpu_device *adev, + int nps_mode); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index b17e63c98a99..5d721ccb9dfd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -667,6 +667,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev) task_barrier_init(&hive->tb); hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN; hive->hi_req_gpu = NULL; + atomic_set(&hive->requested_nps_mode, UNKNOWN_MEMORY_PARTITION_MODE); /* * hive pstate on boot is high in vega20 so we have to go to low diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h index d652727ca565..67abadb4f298 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h @@ -46,6 +46,7 @@ struct amdgpu_hive_info { atomic_t ras_recovery; struct ras_event_manager event_mgr; struct work_struct reset_on_init_work; + atomic_t requested_nps_mode; }; struct amdgpu_pcs_ras_field { -- 2.25.1
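As a usage note, two other patches in this series consume this helper: patch 7 wires it into the GC 9.4.3/9.4.4 gmc_funcs table, and patch 5 then reaches it through that function pointer at driver unload. Excerpted here for cross-reference (not self-contained code):

/* Wiring in gmc_v9_0.c (patch 7 of this series) */
static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
        /* ... existing callbacks ... */
        .request_mem_partition_mode = &amdgpu_gmc_request_memory_partition,
};

/* Call through the indirection at unload time (patch 5 of this series) */
r = adev->gmc.gmc_funcs->request_mem_partition_mode(adev, req_nps_mode);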
[PATCH v2] drm/amdgpu: simplify vram alloc logic since 2GB limitation removed
This is a follow-up of commit b2dba064c9bd ("drm/amdgpu: Handle sg size limit for contiguous allocation"). The 2GB limitation in VRAM allocation was removed in the above commit. The remaining_size calculation was there to address the 2GB limitation in contiguous VRAM allocation and is thus no longer needed. Simplify the vram alloc logic. Signed-off-by: Yifan Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index 7d26a962f811..3d129fd61fa7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -455,7 +455,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); u64 vis_usage = 0, max_bytes, min_block_size; struct amdgpu_vram_mgr_resource *vres; - u64 size, remaining_size, lpfn, fpfn; + u64 size, lpfn, fpfn; unsigned int adjust_dcc_size = 0; struct drm_buddy *mm = &mgr->mm; struct drm_buddy_block *block; @@ -516,25 +516,23 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, adev->gmc.gmc_funcs->get_dcc_alignment) adjust_dcc_size = amdgpu_gmc_get_dcc_alignment(adev); - remaining_size = (u64)vres->base.size; + size = (u64)vres->base.size; if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && adjust_dcc_size) { unsigned int dcc_size; dcc_size = roundup_pow_of_two(vres->base.size + adjust_dcc_size); - remaining_size = (u64)dcc_size; + size = (u64)dcc_size; vres->flags |= DRM_BUDDY_TRIM_DISABLE; } mutex_lock(&mgr->lock); - while (remaining_size) { + while (true) { if (tbo->page_alignment) min_block_size = (u64)tbo->page_alignment << PAGE_SHIFT; else min_block_size = mgr->default_page_size; - size = remaining_size; - if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS && adjust_dcc_size) min_block_size = size; else if ((size >= (u64)pages_per_block << PAGE_SHIFT) && @@ -562,10 +560,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, if (unlikely(r)) goto error_free_blocks; - if (size > remaining_size) - remaining_size = 0; - else - remaining_size -= size; + break; } mutex_unlock(&mgr->lock); -- 2.43.0