Add descriptions of the new functions added
to amd_ip_funcs.
The new functions are:
a. dump_ip_state
b. print_ip_state
Signed-off-by: Sunil Khatri
---
drivers/gpu/drm/amd/include/amd_shared.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/include/amd_shared.
On 26.04.24 at 02:27, Tim Huang wrote:
Clear the overflowed array index read warning with a cast operation.
Signed-off-by: Tim Huang
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/a
Avoid using negative values
of clk_index as an index into the array pptable->DpmDescriptor.
Signed-off-by: Jesse Zhang
---
.../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 25 +++
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11
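The concern in the navi10_ppt.c summary above, a negative lookup result being used as an array index, can be sketched roughly as follows. This is a user-space illustration only, not the patch itself: lookup_clk_index(), get_descriptor() and the table of ints are hypothetical stand-ins for the driver's mapping helper and pptable->DpmDescriptor.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a mapping helper that returns a negative
 * errno-style value when the clock type has no mapping.
 */
int lookup_clk_index(int clk_type)
{
	return (clk_type >= 0 && clk_type < 4) ? clk_type : -22; /* -EINVAL */
}

/*
 * Return a pointer to the descriptor for clk_type, or NULL when the
 * lookup failed or the index falls outside the table.
 */
const int *get_descriptor(const int *table, size_t len, int clk_type)
{
	int clk_index = lookup_clk_index(clk_type);

	/* Reject negative results before they are used as an index. */
	if (clk_index < 0 || (size_t)clk_index >= len)
		return NULL;

	return &table[clk_index];
}
```

A caller would treat a NULL return as an error instead of dereferencing a descriptor at a negative offset.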
Using index i - 1U may go beyond the element index
for mc_data[] when i = 0.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c
b/drivers/gpu/drm
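The i - 1U problem described in the ppatomctrl.c summary above comes from unsigned wraparound: with i = 0, the expression i - 1U evaluates to UINT_MAX rather than -1. A minimal sketch, assuming a hypothetical helper prev_or_default() and a plain uint32_t array in place of mc_data[]:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the element preceding index i, or def when there is no valid
 * predecessor. Hypothetical helper for illustration only.
 */
uint32_t prev_or_default(const uint32_t *data, size_t len,
			 unsigned int i, uint32_t def)
{
	/* i == 0 would make i - 1U wrap around to UINT_MAX. */
	if (i == 0 || (size_t)(i - 1U) >= len)
		return def;

	return data[i - 1U];
}
```

Checking i before subtracting keeps the index within the array for every input.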
Fix warnings for using uninitialized values sclk_mask, mclk_mask and soc_mask.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
b/drivers/gpu
platform: Ryzen 5600U
[520277.842817]
==
[520277.842821] BUG: KFENCE: use-after-free read in amdgpu_bo_move+0x1ce/0x710
[amdgpu]
[520277.843054] Use-after-free read at 0x31f4f80d (in kfence-#198):
[520277.843057] amdgpu_bo_
On 26.04.24 at 05:24, Ma, Jun wrote:
On 4/25/2024 8:39 PM, Christian König wrote:
On 25.04.24 at 12:00, Ma Jun wrote:
Check the ring type value to fix the out-of-bounds
write warning
Signed-off-by: Ma Jun
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 +
1 file changed, 5 inserti
On 26.04.24 at 05:57, Yunxiang Li wrote:
The retry loop for SRIOV reset has refcount and memory leak issues.
Depending on which function call fails, it can potentially call
amdgpu_amdkfd_pre/post_reset a different number of times, causing the
kfd_locked count to be wrong. This will block all futur
kfd_locked count to be wrong. This will block all futur
[AMD Official Use Only - General]
Please ignore this patch, Thomas will submit a new patch to replace it.
Best Regards,
Kevin
-----Original Message-----
From: Zhou1, Tao
Sent: Friday, April 26, 2024 11:15 AM
To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Chai, Thomas
Small APUs (i.e., consumer and embedded products) usually have a small
carveout of device memory, which can't satisfy the memory allocation
requirements of most compute workloads.
We can't even run a Basic MNIST Example with a default 512MB carveout.
https://github.com/pytorch/examples/tree/main/mnist.
Though w
The default ttm_tt_pages_limit is 1/2 of system memory.
Such a configuration is prone to running out of memory.
Signed-off-by: Lang Yu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 4 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkf
[AMD Official Use Only - General]
>-----Original Message-----
>From: Li, Yunxiang (Teddy)
>Sent: Friday, April 26, 2024 11:58 AM
>To: amd-gfx@lists.freedesktop.org
>Cc: Deucher, Alexander ; Koenig, Christian
>; Lazar, Lijo ; Kuehling,
>Felix ; Deng, Emily ; Li,
>Yunxiang (Teddy)
>Subject: [PATCH
Clear warnings about using the uninitialized value level when the driver
fails to get the value from the SMU.
Signed-off-by: Tim Huang
---
.../drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c| 18 +++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu
Check the return value of smum_send_msg_to_smc; otherwise
we might use the uninitialized variable "now".
Signed-off-by: Ma Jun
---
drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr
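The pattern described in the smu10_hwmgr.c summary above, checking a message call's return value before consuming its output, can be sketched as below. This is only an illustration under assumptions: query_fw() is a hypothetical stand-in that writes its output only on success, and it does not reproduce the signature of smum_send_msg_to_smc.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a firmware query: it writes *out only on
 * success and returns a negative errno-style value on failure.
 */
int query_fw(int should_fail, uint32_t *out)
{
	if (should_fail)
		return -5; /* -EIO */

	*out = 1200;
	return 0;
}

/* Report the current clock, propagating a failed query to the caller. */
int read_current_clock(int should_fail, uint32_t *clock)
{
	uint32_t now = 0; /* initialised defensively */
	int ret;

	ret = query_fw(should_fail, &now);
	if (ret)
		return ret; /* never consume "now" after a failed query */

	*clock = now;
	return 0;
}
```

On failure the caller's output is left untouched, so no stale or uninitialized value propagates.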
Check the user input and phy_id value range to fix
"Using uninitialized value phy_id"
Signed-off-by: Ma Jun
---
drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c
b/drivers/gpu/drm/amd/amdgp
Initialize variable size before calling
hwmgr->hwmgr_func->iread_sensor, such as smu7_read_sensor.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
b/drivers/gpu/d
Check the return of function smum_send_msg_to_smc
as it may fail to initialize the variable.
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c | 8 +--
.../drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 21 ---
.../drm/amd/pm/powerplay/hwmgr/vega10_hw
On 26.04.24 at 11:29, Tim Huang wrote:
Clear warnings about using the uninitialized value level when the driver
fails to get the value from the SMU.
Signed-off-by: Tim Huang
Maybe drop the blank line before the "if (ret)", apart from that
Reviewed-by: Christian König
---
.../drm/amd/pm/powerplay/hwmgr/sm
[Public]
Reviewed-by: Alex Deucher
From: Sunil Khatri
Sent: Friday, April 26, 2024 3:18 AM
To: Deucher, Alexander ; Koenig, Christian
Cc: amd-gfx@lists.freedesktop.org ; Khatri,
Sunil
Subject: [PATCH] drm/amdgpu: add function descripion of new functions
Add
This patch series introduces AMDGPU usermode queues for gfx workloads.
Usermode queues are a method of GPU workload submission into the graphics
hardware without any interaction with kernel/DRM schedulers. In this
method, a userspace graphics application can create its own workqueue and
submit it di
This patch adds skeleton code for amdgpu usermode queue.
It contains:
- A new file with init functions of usermode queues.
- A queue context manager in driver private data.
V1: Worked on design review comments from RFC patch series:
(https://patchwork.freedesktop.org/series/112214/)
- Alex: Keep
From: Alex Deucher
This patch introduces a new UAPI/IOCTL for the usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.
This UAPI maps the queue into GPU, so the graphics app can s
The FW expects us to allocate at least one page as context
space to process gang, process, GDS and FW related work.
This patch creates a joint object for the same, and calculates
GPU space offsets of these spaces.
V1: Addressed review comments on RFC patch:
Alex: Make this function IP specifi
This patch introduces amdgpu_userqueue_object and its helper
functions to create and destroy this object. The helper
functions create/destroy a base amdgpu_bo, kmap/unmap it and
save the respective GPU and CPU addresses in the encapsulating
userqueue object.
These helpers will be used to create
A Memory queue descriptor (MQD) of a userqueue defines it in
the hw's context. As MQD format can vary between different
graphics IPs, we need gfx GEN specific handlers to create MQDs.
This patch:
- Adds a new file which will be used for MES based userqueue
functions targeting GFX and SDMA IP.
-
This patch adds:
- A new IOCTL function to create and destroy
- A new structure to keep all the user queue data in one place.
- A function to generate unique index for the queue.
V1: Worked on review comments from RFC patch series:
- Alex: Keep a list of queues, instead of single queue per proce
To support oversubscription, MES FW expects WPTR BOs to
be mapped into GART, before they are submitted to usermode
queues. This patch adds a function for the same.
V4: fix the wptr value before mapping lookup (Bas, Christian).
V5: Addressed review comments from Christian:
- Either pin object
The userspace sends us the doorbell object and the relative doorbell
index in the object to be used for the usermode queue, but the FW
expects the absolute doorbell index on the PCI BAR in the MQD. This
patch adds a function to convert this relative doorbell index to
absolute doorbell index.
V5: Fi
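The relative-to-absolute doorbell conversion described in the summary above can be sketched as a small piece of index arithmetic. Everything here is an assumption made for illustration: struct doorbell_obj, DOORBELL_SLOT_BYTES and doorbell_abs_index() are hypothetical, and the sketch assumes the object occupies a contiguous run of fixed-size slots at a known byte offset from the start of the BAR. It is not the function the patch adds.

```c
#include <stdint.h>

/* Assumed size of one doorbell slot, in bytes (illustrative only). */
#define DOORBELL_SLOT_BYTES 4u

/* Hypothetical description of a doorbell object placed in the BAR. */
struct doorbell_obj {
	uint64_t bar_offset; /* byte offset of the object from BAR start */
	uint32_t num_slots;  /* number of doorbell slots in the object */
};

/*
 * Convert an index relative to the object into an index relative to
 * the start of the BAR. Returns 0 on success, -1 if rel is out of range.
 */
int doorbell_abs_index(const struct doorbell_obj *obj, uint32_t rel,
		       uint64_t *abs)
{
	if (rel >= obj->num_slots)
		return -1;

	*abs = obj->bar_offset / DOORBELL_SLOT_BYTES + rel;
	return 0;
}
```

For example, under these assumptions an object at byte offset 4096 with relative index 3 maps to absolute index 1027.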
This patch adds new functions to map/unmap a usermode queue into
the FW, using the MES ring. As soon as this mapping is done, the
queue would be considered ready to accept the workload.
V1: Addressed review comments from Alex on the RFC patch series
- Map/Unmap should be IP specific.
V2:
The current MES GFX mask prevents the FW from enabling oversubscription. This patch
does the following:
- Fixes the mask values and adds a description for the same.
- Removes the central mask setup and makes it IP specific, as it would
be different when the number of pipes and queues are different.
V9: intr
This patch:
- adds a kernel config option "CONFIG_DRM_AMD_USERQ_GFX"
- moves the userqueue initialization code for all IPs under
this flag
so that the userqueue works only when the config is enabled.
Cc: Alex Deucher
Cc: Christian Koenig
Signed-off-by: Shashank Sharma
---
drivers/gpu/drm/amd
This patch adds code to clean up any leftover userqueues which
a user might have failed to destroy due to a crash or any other
programming error.
V7: Added Alex's R-B
V8: Rebase
V9: Rebase
Cc: Alex Deucher
Cc: Christian Koenig
Reviewed-by: Alex Deucher
Suggested-by: Bas Nieuwenhuizen
Signed-of
This patch makes the necessary modifications to enable the SDMA
usermode queues using the existing userqueue infrastructure.
V9: introduced this patch in the series
Cc: Christian König
Cc: Alex Deucher
Signed-off-by: Shashank Sharma
Signed-off-by: Arvind Yadav
Signed-off-by: Srinivasan Shanmugam
From: Arvind Yadav
This patch makes the necessary changes required to
enable compute workload support using the existing
usermode queues infrastructure.
Cc: Alex Deucher
Cc: Christian Koenig
Signed-off-by: Arvind Yadav
Signed-off-by: Shashank Sharma
---
drivers/gpu/drm/amd/amdgpu/amdgpu_user
On 4/13/2024 03:51, Gergo Koteles wrote:
Hi,
ABM will reduce the backlight and compensate by adjusting brightness and
contrast of the image. It has 5 levels: 0, 1, 2, 3, 4. 0 means off. 4 means
maximum backlight reduction. IMO, 1 and 2 look okay. 3 and 4 can be quite
impactful, both to power
From: Tvrtko Ursulin
Help code readability by replacing a bunch of:
bo->tbo.base.resv == vm->root.bo->tbo.base.resv
With:
amdgpu_bo_is_vm_bo(bo, vm)
No functional changes.
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 2 +-
drivers/gpu/drm/amd/amdgpu/amdgp
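The readability change in the summary above replaces a repeated pointer comparison with a named helper. A minimal user-space sketch of that idea follows, using hypothetical stub structs that only mirror the field path quoted in the commit message (tbo.base.resv and root.bo); the real amdgpu types and the actual helper definition are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical stubs mirroring only the fields named in the message. */
struct resv_stub { int unused; };
struct gem_base_stub { struct resv_stub *resv; };
struct tbo_stub { struct gem_base_stub base; };
struct bo_stub { struct tbo_stub tbo; };
struct vm_root_stub { struct bo_stub *bo; };
struct vm_stub { struct vm_root_stub root; };

/*
 * A BO is treated as belonging to the VM when it shares the
 * reservation object of the VM's root BO.
 */
bool bo_is_vm_bo(const struct bo_stub *bo, const struct vm_stub *vm)
{
	return bo->tbo.base.resv == vm->root.bo->tbo.base.resv;
}
```

Call sites then read as a question about the BO rather than a chain of member accesses, with no change in behaviour.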
From: Tvrtko Ursulin
When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region, it failed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
That imbalance leads to such objects getting accounted against th
From: Tvrtko Ursulin
As I was reading through the driver I spotted one thing which could
perhaps make it more readable (1/3), one thing which reduces some double
conversions (in principle) from TTM placement back to domain (2/3), and
also enables the last patch in the series which maybe fixes a s
From: Tvrtko Ursulin
All memory domains apart from AMDGPU_GEM_DOMAIN_GTT map 1:1 to TTM
placements. The former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.
Simplify a few places in the code which convert the TTM placement into
a domain by checking aga
[Public]
> Why remove this?
Oops it's a copy-paste error from the previous revision
> Need to call amdgpu_virt_release_full_gpu(adev, true) before retry, and the
> same as below.
I thought we talked about if we call amdgpu_virt_{reset,request_full}_gpu again
we don't need to release full gpu, I
There are other reset sources that pass NULL as the job pointer, such as
amdgpu_amdkfd_reset_work. Therefore, using the job pointer to check if
the FLR comes from the host does not work.
Add a flag in reset_context to explicitly mark host triggered reset, and
set this flag when we receive host res
The retry loop for SRIOV reset has refcount and memory leak issues.
Depending on which function call fails, it can potentially call
amdgpu_amdkfd_pre/post_reset a different number of times, causing the
kfd_locked count to be wrong. This will block all future attempts at
opening /dev/kfd. The retry loop
From: Likun Gao
memory queue descriptors for gfx12.
v2: squash in sdma updates (Alex)
Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/include/v12_structs.h | 1188 +
1 file changed, 1188 insertions(+)
create mode 1
From: Likun Gao
Add sdma firmware struct version 3 to support
sdma v7_0 firmware.
Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 6 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 9 +
2 files changed
From: Jack Xiao
Set SDMAx_WATCHDOG_CNTL.QUEUE_HANG_COUNT registers
to improve SDMA reliability.
Signed-off-by: Jack Xiao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/dr
From: Likun Gao
Add new members in sdma instance structure
for sdma v7_0 firmware.
Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/a
From: Likun Gao
v1: Add sdma v7_0 ip block support. (Likun)
v2: Move vmhub from ring_funcs to ring. (Hawking)
v3: Switch to AMDGPU_GFXHUB(0). (Hawking)
v4: Move microcode init into early_init. (Likun)
v5: Fix warnings (Alex)
v6: Squash in various fixes (Alex)
v7: Rebase (Alex)
v8: Rebase (Alex)
From: Likun Gao
Add sdma v7_0 ip block.
v2: squash in updates (Alex)
Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_d
From: Likun Gao
Provide some missing ucode names shown via firmware ID.
v2: fix whitespace (Alex)
Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 24 +++
1 file changed, 24 insertions(+)
diff --
From: Likun Gao
Add support for new SDMA firmware struct (V3) with PSP
front door load type.
Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 10 ++
driver
There is a race condition when re-creating a kfd_process for a process.
This has been observed when a process under the debugger executes
exec(3). In this scenario:
- The process executes exec.
- This will eventually release the process's mm, which will cause the
kfd_process object associated
tree/branch:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: bb7a2467e6beef44a80a17d45ebf2931e7631083 Add linux-next specific
files for 20240426
Error/Warning reports:
https://lore.kernel.org/oe-kbuild-all/202404262217.dt4hoodh-...@intel.com
Error
On 2024-04-26 14:55, Lancelot SIX wrote:
There is a race condition when re-creating a kfd_process for a process.
This has been observed when a process under the debugger executes
exec(3). In this scenario:
- The process executes exec.
- This will eventually release the process's mm, which wi
Hi Dave, Sima,
More new stuff for 6.10.
The following changes since commit 0208ca55aa9c9b997da1f5bc45c4e98916323f08:
Backmerge tag 'v6.9-rc5' into drm-next (2024-04-22 14:35:52 +1000)
are available in the Git repository at:
https://gitlab.freedesktop.org/agd5f/linux.git
tags/amd-drm-next-
On 2024-04-26 12:43, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin
When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region, it failed to notice the conceptual
imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
That imbalance lea
On 2024-04-26 04:37, Lang Yu wrote:
The default ttm_tt_pages_limit is 1/2 of system memory.
Such a configuration is prone to running out of memory.
Indiscriminately allowing the violation of all memory limits is not a
good solution. It will lead to poor performance once you actually reach
ttm_p
On 2024-04-26 04:37, Lang Yu wrote:
Small APUs (i.e., consumer and embedded products) usually have a small
carveout of device memory, which can't satisfy the memory allocation
requirements of most compute workloads.
We can't even run a Basic MNIST Example with a default 512MB carveout.
https://github.com/pyt