On 20.02.25 at 17:27, André Almeida wrote:
> Instead of only triggering a wedged event for complete GPU resets,
> trigger for all types, like soft resets and ring resets. Regardless of
> the reset, it's useful for userspace to know that it happened because
> the kernel will reject further submissi
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Jesse,
> -Original Message-
> From: Zhang, Jesse(Jie)
> Sent: Friday, February 21, 2025 2:50 PM
> To: Huang, Tim ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Kim, Jonathan
> ; Zhu, Jiadong ; Prosyak,
> Vitaly
> Su
[AMD Official Use Only - AMD Internal Distribution Only]
-Original Message-
From: Huang, Tim
Sent: Friday, February 21, 2025 2:46 PM
To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Kim, Jonathan
; Zhu, Jiadong ; Zhang, Jesse(Jie)
; Prosyak, Vitaly ; Zhang,
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Jesse,
> -Original Message-
> From: amd-gfx On Behalf Of
> jesse.zh...@amd.com
> Sent: Friday, February 14, 2025 1:56 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Kim, Jonathan
> ; Zhu, Jiadong ; Zhang,
> Je
From: "jesse.zh...@amd.com"
This patch introduces a new function to check if the SMU supports resetting the
SDMA engine. This capability check ensures that the driver does not attempt to
reset the SDMA engine on hardware that does not support it.
The following changes are included:
- New funct
From: "jesse.zh...@amd.com"
- Introduce a new function `sdma_v4_4_2_init_sysfs_reset_mask` to initialize
the sysfs reset mask for SDMA.
- Move the initialization of the sysfs reset mask to the `late_init` stage to
ensure that the SMU initialization and capability setup are completed befor
save only one record to save EEPROM space, and
bad_page_num = pa_rec_num + mca_rec_num*16
Signed-off-by: ganglxie
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 49 +--
.../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 17 +++
.../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h|
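The counting rule quoted in that commit message (bad_page_num = pa_rec_num + mca_rec_num*16) can be sketched as a small helper. This is only an illustration of the arithmetic, not the driver's code: the function name and the macro name are hypothetical, and the factor 16 is taken directly from the message.

```c
#include <stdint.h>

/* Hypothetical name; the factor 16 is the one given in the commit message. */
#define PAGES_PER_MCA_RECORD 16u

/*
 * Illustrative only: total bad pages when each PA record counts as one
 * page and each MCA record counts as 16 pages.
 *   bad_page_num = pa_rec_num + mca_rec_num * 16
 */
static inline uint32_t bad_page_count(uint32_t pa_rec_num, uint32_t mca_rec_num)
{
	return pa_rec_num + mca_rec_num * PAGES_PER_MCA_RECORD;
}
```

For example, 3 PA records and 2 MCA records would be counted as 3 + 2*16 = 35 bad pages under this rule.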
bad page adding can be simpler with nps info
Signed-off-by: ganglxie
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 196 +---
1 file changed, 105 insertions(+), 91 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
ind
nps info saved together with bad page makes bad page parsing more efficient
Signed-off-by: ganglxie
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 8 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h| 7 +++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers
[Public]
-Original Message-
From: Deucher, Alexander
Sent: Thursday, February 20, 2025 10:22 PM
To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Kim, Jonathan
; Zhu, Jiadong ; Zhang, Jesse(Jie)
; Zhang, Jesse(Jie)
Subject: RE: [PATCH V7 5/9] drm/amdgpu: Updat
[AMD Official Use Only - AMD Internal Distribution Only]
Ping..
Emily Deng
Best Wishes
>-Original Message-
>From: Emily Deng
>Sent: Thursday, February 20, 2025 2:25 PM
>To: amd-gfx@lists.freedesktop.org
>Cc: Deng, Emily
>Subject: [PATCH 3/3] drm/amdkfd: Skip update vmid in while
[Public]
> From: Alex Deucher
> Sent: Thursday, February 20, 2025 10:18 PM
> To: Liang, Prike
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> ; Koenig, Christian
> ; Lazar, Lijo
> Subject: Re: [PATCH 1/4] drm/amdgpu/gfx11: Implement the GFX11 KGQ pipe
> reset
>
> On Thu, Feb 20, 2025
Call amdgpu_amdkfd_reserve_mem_limit in svm_range_vram_node_new when
creating a new SVM BO. Call amdgpu_amdkfd_unreserve_mem_limit
in svm_range_bo_release when the SVM BO is deleted.
v2:
Refine the error handling in svm_range_vram_node_new.
Signed-off-by: Emily Deng
---
drivers/gpu/drm/amd/a
On Wed, Feb 19, 2025 at 05:49:01PM +0800, Huacai Chen wrote:
> > Looks like that's not going to work. Without patch 7 I'm getting a
> > warning (upgraded to a build error with a pending change to upgrade
> > objtool warnings to errors):
> >
> > arch/loongarch/kernel/machine_kexec.o: error: objtool
On 2/20/2025 12:31, Pratap Nirujogi wrote:
This patch series includes two patches:
patch-1: drm/amdgpu: Replace DRM_ERROR() with drm_err()
patch-2: drm/amdgpu: Add amdisp pinctrl MFD resource
About Patch-1: DRM_ERROR() is no longer preferred. Replace
DRM_ERROR() usage with drm_err() in isp driv
Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG5_0_1.
Signed-off-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c | 150 +++
1 file changed, 69 insertions(+), 81 deletions(-)
diff --git a/drivers/gpu/drm/amd
Add ring reset function callback for JPEG5_0_1 to
recover from job timeouts without a full gpu reset.
Signed-off-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c | 50
1 file changed, 50 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c
Add core reset control register definitions and align
all prior register definitions to end at 100 column
length for uniformity.
Signed-off-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.h | 128 ---
1 file changed, 68 insertions(+), 60 deletions(-)
diff --git
DRM_ERROR() is no longer preferred. Replace DRM_ERROR() usage
with drm_err() in isp driver.
Signed-off-by: Pratap Nirujogi
---
drivers/gpu/drm/amd/amdgpu/isp_v4_1_0.c | 15 ++-
drivers/gpu/drm/amd/amdgpu/isp_v4_1_1.c | 15 ++-
2 files changed, 20 insertions(+), 10 deletio
This patch series includes two patches:
patch-1: drm/amdgpu: Replace DRM_ERROR() with drm_err()
patch-2: drm/amdgpu: Add amdisp pinctrl MFD resource
About Patch-1: DRM_ERROR() is no longer preferred. Replace
DRM_ERROR() usage with drm_err() in isp driver.
About Patch-2: Sensor module power is co
From: Benjamin Chan
AMDISP GPIO control uses a dedicated pinctrl driver,
and requires MFD hotadd GPIO resources.
Co-developed-by: Pratap Nirujogi
Signed-off-by: Benjamin Chan
Signed-off-by: Pratap Nirujogi
---
drivers/gpu/drm/amd/amdgpu/amdgpu_isp.h | 1 +
drivers/gpu/drm/amd/amdgpu/isp_v4_
[Public]
> -Original Message-
> From: Lazar, Lijo
> Sent: Thursday, February 20, 2025 1:30 AM
> To: Kim, Jonathan ; Alex Deucher
>
> Cc: Deucher, Alexander ; Zhang, Jesse(Jie)
> ; amd-gfx@lists.freedesktop.org; Kuehling, Felix
> ; Zhu, Jiadong
> Subject: Re: [PATCH V7 3/9] drm/amdgpu: A
On Thu, Feb 20, 2025 at 5:08 AM Christian König
wrote:
>
> On 20.02.25 at 06:41, Lazar, Lijo wrote:
> > On 2/19/2025 2:35 PM, jesse.zh...@amd.com wrote:
> >> From: "jesse.zh...@amd.com"
> >>
> >> - Modify the VM invalidation engine allocation logic to handle SDMA page
> >> rings.
> >> SDMA pa
On 12.02.25 at 15:44, Alex Deucher wrote:
> Xinhui's email is no longer valid.
>
> Signed-off-by: Alex Deucher
Reviewed-by: Christian König
> ---
> MAINTAINERS | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c8b35ca294a02..d39b272a6a751 100644
> --- a
On 2025-02-20 7:00, Emily Deng wrote:
> When XNACK is off, the application should ensure that VRAM is not overcommitted.
This is incorrect. SVM ranges in VRAM can always be evicted to system memory
even with XNACK off. During the migration the user mode queues are stopped by
the MMU notifier. We apply s
On 2025-02-20 6:59, Emily Deng wrote:
> Call amdgpu_amdkfd_reserve_mem_limit in svm_range_vram_node_new when
> creating a new SVM BO. Call amdgpu_amdkfd_unreserve_mem_limit
> in svm_range_bo_release when the SVM BO is deleted.
>
> Signed-off-by: Emily Deng
> ---
> drivers/gpu/drm/amd/amdkfd/kfd
[Public]
> -Original Message-
> From: Deucher, Alexander
> Sent: Wednesday, February 19, 2025 6:27 PM
> To: jesse.zh...@amd.com; amd-gfx@lists.freedesktop.org
> Cc: Kuehling, Felix ; Kim, Jonathan
> ; Zhu, Jiadong ; Zhang,
> Jesse(Jie) ; Zhang, Jesse(Jie)
>
> Subject: RE: [PATCH V7 5/9] d
On Thu, Feb 20, 2025 at 4:39 AM Liang, Prike wrote:
>
> [Public]
>
> The various gfx11/gfx12 systems share the same start PC value, but it seems
> better to use the specific register CP_ME_PRGRM_CNTR_START to get the start
> PC value.
Why not store the value per device? Or if it's always the s
Dropping this patch. Will work around this in the driver.
Alex
On Mon, Feb 17, 2025 at 10:48 AM Alex Deucher wrote:
>
> There was a quirk added to add a workaround for a Sapphire
> RX 5600 XT Pulse. However, the quirk only checks the vendor
> ids and not the subsystem ids. The quirk really sh
…
> ---
> drivers/gpu/drm/radeon/r420.c | 15 +++
…
Have you considered improving your version management?
https://lore.kernel.org/all/?q=%22This+looks+like+a+new+version+of+a+previously+submitted+patch%22
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Leo Liu
> -Original Message-
> From: Sundararaju, Sathishkumar
> Sent: February 20, 2025 8:48 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Liu, Leo ; Sundararaju, Sathishkumar
>
> Subject: [PATCH] drm/amdgpu: Do not
Update power gate setting to not poweroff UVDJ in JPEG4_0_3.
Signed-off-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
b/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c
index 0588bb80f41e
When XNACK is off, the application should ensure that VRAM is not overcommitted.
Signed-off-by: Emily Deng
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_
Call amdgpu_amdkfd_reserve_mem_limit in svm_range_vram_node_new when
creating a new SVM BO. Call amdgpu_amdkfd_unreserve_mem_limit
in svm_range_bo_release when the SVM BO is deleted.
Signed-off-by: Emily Deng
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 13 -
drivers/gpu/drm/amd/am
On 20.02.25 at 06:41, Lazar, Lijo wrote:
> On 2/19/2025 2:35 PM, jesse.zh...@amd.com wrote:
>> From: "jesse.zh...@amd.com"
>>
>> - Modify the VM invalidation engine allocation logic to handle SDMA page
>> rings.
>> SDMA page rings now share the VM invalidation engine with SDMA gfx rings
>> in
On Thu, Feb 20, 2025 at 09:31:41AM +0100, Greg KH wrote:
> On Fri, Jan 24, 2025 at 11:45:14PM -0700, Jim Cromie wrote:
> > This series fixes dynamic-debug's support for DRM debug-categories.
> > Classmaps-v1 evaded full review, and got committed in 2 chunks:
> >
> > b7b4eebdba7b..6ea3bf466ac6
[Public]
The various gfx11/gfx12 systems share the same start PC value, but it seems
better to use the specific register CP_ME_PRGRM_CNTR_START to get the start PC
value.
Regards,
Prike
> -Original Message-
> From: Alex Deucher
> Sent: Thursday, February 20, 2025 3:56 AM
> To: L
On 19.02.25 at 22:35, André Almeida wrote:
> Instead of only triggering a wedged event for complete GPU resets,
> trigger for all types, like soft resets and ring resets. Regardless of
> the reset, it's useful for userspace to know that it happened because
> the kernel will reject further submissi
[Public]
Hi Jesse,
> -Original Message-
> From: jesse.zh...@amd.com
> Sent: Thursday, February 20, 2025 4:31 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Zhu, Jiadong
> ; Huang, Tim ; Zhang,
> Jesse(Jie) ; Prosyak, Vitaly
> ; Zhang, Jesse(Jie)
> Subject: [PATCH v2 1
Instead of only triggering a wedged event for complete GPU resets,
trigger for all types, like soft resets and ring resets. Regardless of
the reset, it's useful for userspace to know that it happened because
the kernel will reject further submissions from that app.
Signed-off-by: André Almeida
--
* Alex Deucher (alexdeuc...@gmail.com) wrote:
> On Wed, Feb 19, 2025 at 2:04 PM Dr. David Alan Gilbert
> wrote:
> >
> > Hi All,
> > I think you may be missing some wiring of nbif_v6_3_1_sriov_funcs.
> >
> > My scripts noticed 'nbif_v6_3_1_sriov_funcs' was unused;
> > It was added in:
> > Com
From: "Dr. David Alan Gilbert"
The nbif_v6_3_1_sriov_funcs instance of amdgpu_nbio_funcs was added in
commit 894c6d3522d1 ("drm/amdgpu: Add nbif v6_3_1 ip block support")
but has remained unused.
Alex has confirmed it wasn't needed.
Remove it, together with the four unused stub functions:
nbi
On Fri, Jan 24, 2025 at 11:45:14PM -0700, Jim Cromie wrote:
> This series fixes dynamic-debug's support for DRM debug-categories.
> Classmaps-v1 evaded full review, and got committed in 2 chunks:
>
> b7b4eebdba7b..6ea3bf466ac6 # core dyndbg changes
> 0406faf25fb1..ee7d633f2dfb # drm adoption
After a GPU reset happens, the driver creates a coredump file. However,
the user might not be aware of it. Log the file creation so the user can
find more information about the device and add the file to bug reports.
This is similar to what the xe driver does.
Signed-off-by: André Almeida
---
drive
When a ring reset happens, the kernel log shows only "amdgpu: Starting
ring reset", but when it finishes nothing appears in the
log. Explicitly write in the log that the reset has finished correctly.
Signed-off-by: André Almeida
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 1 +
1 file changed,
This series does some small improvements to GPU reset information collection.
André Almeida (3):
drm/amdgpu: Log the creation of a coredump file
drm/amdgpu: Log after a successful ring reset
drm/amdgpu: Trigger a wedged event for every type of reset
.../gpu/drm/amd/amdgpu/amdgpu_dev_coredu
Hi All,
I think you may be missing some wiring of nbif_v6_3_1_sriov_funcs.
My scripts noticed 'nbif_v6_3_1_sriov_funcs' was unused;
It was added in:
Commit: 894c6d3522d1 ("drm/amdgpu: Add nbif v6_3_1 ip block support")
and is:
drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.c:
const struct amdgpu
In r420_cp_errata_init(), the RESYNC information is stored even
when the Scratch register is not correctly allocated.
Change the return type of r420_cp_errata_init() from void to int
to propagate errors to the caller. Add error checking after
radeon_scratch_get() to ensure RESYNC information is st
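The change described above (switching an init routine from void to int so that a failed scratch allocation is reported instead of ignored) follows a common pattern, sketched below. Everything in this block is a hypothetical stand-in: `struct fake_dev`, `fake_scratch_get()` and `fake_cp_errata_init()` are not the radeon structures or functions, and the register value is arbitrary.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for device state; not the real radeon structs. */
struct fake_dev {
	int scratch_available;   /* nonzero if a scratch register can be allocated */
	uint32_t resync_scratch; /* scratch register that would hold RESYNC info */
	int resync_stored;       /* set once RESYNC information has been recorded */
};

/* Stand-in for a scratch allocator: 0 on success, negative errno on failure. */
static int fake_scratch_get(struct fake_dev *dev, uint32_t *reg)
{
	if (!dev->scratch_available)
		return -EINVAL;
	*reg = 1; /* arbitrary illustrative register id */
	return 0;
}

/* Returns int instead of void so the caller can see allocation failures. */
static int fake_cp_errata_init(struct fake_dev *dev)
{
	int r = fake_scratch_get(dev, &dev->resync_scratch);

	if (r)
		return r; /* do not store RESYNC info without a valid register */

	dev->resync_stored = 1;
	return 0;
}
```

A caller would then check the return value and abort its own initialization on a nonzero result, rather than continuing with an unallocated register.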
[AMD Official Use Only - AMD Internal Distribution Only]
-Original Message-
From: Huang, Tim
Sent: Thursday, February 20, 2025 4:55 PM
To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhu, Jiadong
; Zhang, Jesse(Jie) ; Kim, Jonathan
; Prosyak, Vitaly ; Zhan
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Jesse,
> -Original Message-
> From: jesse.zh...@amd.com
> Sent: Thursday, February 20, 2025 4:31 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Zhu, Jiadong
> ; Huang, Tim ; Zhang,
> Jesse(Jie) ; Kim, Jonathan
From: "jesse.zh...@amd.com"
- Modify the `sdma_v4_4_2_sw_init` function to conditionally enable per-queue
reset support.
- For IP versions 9.4.3 and 9.4.4, enable per-queue reset if the MEC firmware
version is at least 0xb0 and PMFW supports queue reset.
- Add a TODO comment for future support
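The gating condition listed above (IP version 9.4.3 or 9.4.4, MEC firmware version at least 0xb0, and PMFW support for queue reset) could be expressed roughly as follows. This is a sketch under assumptions: the `IP_VER()` encoding, the function name and its parameters are hypothetical, and treating every other IP version as unsupported is a choice made for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of an IP version as major.minor.revision. */
#define IP_VER(maj, min, rev) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(rev))

/* Minimum MEC firmware version quoted in the commit message. */
#define MIN_MEC_FW_VERSION 0xb0u

/*
 * Illustrative check: per-queue reset is enabled only for 9.4.3 and 9.4.4,
 * and only when both the MEC firmware and the PMFW qualify.
 */
static bool per_queue_reset_supported(uint32_t ip_version,
				      uint32_t mec_fw_version,
				      bool pmfw_supports_reset)
{
	switch (ip_version) {
	case IP_VER(9, 4, 3):
	case IP_VER(9, 4, 4):
		return mec_fw_version >= MIN_MEC_FW_VERSION && pmfw_supports_reset;
	default:
		/* Assumption for this sketch: other versions are not enabled. */
		return false;
	}
}
```

Keeping the check in one predicate means the firmware threshold and the PMFW capability are evaluated together, so a reset mask would only advertise per-queue reset when both conditions hold.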
From: "jesse.zh...@amd.com"
This patch introduces a new function to check if the SMU supports resetting the
SDMA engine. This capability check ensures that the driver does not attempt to
reset the SDMA engine on hardware that does not support it.
The following changes are included:
- New funct
[AMD Official Use Only - AMD Internal Distribution Only]
Please split the patch into at least three parts: one for the bad page/record
number calculation based on NPS, one for saving the NPS info, and a third for
refining the bad page add code.
Please check my inline comments for other suggestions.
>