[Public]
> From: Kim, Jonathan
> Sent: Wednesday, January 15, 2025 1:14 AM
> To: Liang, Prike ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Kuehling, Felix
> ; Koenig, Christian
> Subject: RE: [PATCH] drm/amdgpu: validate process_context_addr for the MES
> shader debugger
>
> [Publi
[AMD Official Use Only - AMD Internal Distribution Only]
-Original Message-
From: Koenig, Christian
Sent: Wednesday, January 15, 2025 7:56 PM
To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Huang, Tim
; Prosyak, Vitaly
Subject: Re: [PATCH] drm/amdgpu: Use
On Thu, 19 Dec 2024 21:33:37 -0700
Alex Hung wrote:
> From: Harry Wentland
>
> The BT.709 and BT.2020 OETFs are the same, the only difference
> being that the BT.2020 variant is defined with more precision
> for 10 and 12-bit per color encodings.
>
> Both are used as encoding functions for vid
Hey Harry,
Gentle ping on this one :)
On 12/12/2024 16:19, André Almeida wrote:
amdgpu can handle async flips on overlay planes, so allow it for atomic
async checks.
Signed-off-by: André Almeida
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c | 11 +++
1 file changed
[AMD Official Use Only - AMD Internal Distribution Only]
Hi @Deucher, Alexander,
Please hold off on this series; we are currently working on a refined version,
and the current series will be dropped.
Thanks & Regards
Asad
-Original Message-
From: Deucher, Alexander
Sent: Wednesday, Januar
Ping?
Alex
On Tue, Jan 14, 2025 at 3:02 PM Alex Deucher wrote:
>
> This needs to be kerneldoc formatted.
>
> Fixes: 7594874227e1 ("drm/amd/display: add CEC notifier to amdgpu driver")
> Reported-by: Stephen Rothwell
> Signed-off-by: Alex Deucher
> Cc: Kun Liu
> ---
> drivers/gpu/drm/amd/incl
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Asad Kamal
Thanks & Regards
Asad
-Original Message-
From: amd-gfx On Behalf Of Alex Deucher
Sent: Thursday, January 16, 2025 8:37 PM
To: Lazar, Lijo
Cc: amd-gfx@lists.freedesktop.org; Zhang, Hawking ;
Deucher, Alex
On Wed, Jan 15, 2025 at 4:02 PM Marco Moock wrote:
>
> On Wed, 15 Jan 2025 15:27:00 -0500,
> Alex Deucher wrote:
>
> > On Wed, Jan 15, 2025 at 3:22 PM Marco Moock wrote:
> > >
> > > On Wed, 15 Jan 2025 16:08:34 +0100,
> > > Marco Moock wrote:
> > >
> > > > I assume it was 6.12.6, but Debian d
[Public]
> -Original Message-
> From: Liang, Prike
> Sent: Thursday, January 16, 2025 4:16 AM
> To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org; Xiao,
> Jack
> Cc: Deucher, Alexander ; Kuehling, Felix
> ; Koenig, Christian
> Subject: RE: [PATCH] drm/amdgpu: validate process_context_ad
On 16.01.2025 at 11:21:11, Alex Deucher wrote:
> On Wed, Jan 15, 2025 at 4:02 PM Marco Moock wrote:
> >
> > On Wed, 15 Jan 2025 15:27:00 -0500,
> > Alex Deucher wrote:
> >
> > > On Wed, Jan 15, 2025 at 3:22 PM Marco Moock
> > > wrote:
> > > >
> > > > On Wed, 15 Jan 2025 16:08:34 +0100
On Thu, Jan 16, 2025 at 11:31 AM Marco Moock wrote:
>
> On 16.01.2025 at 11:21:11, Alex Deucher wrote:
>
> > On Wed, Jan 15, 2025 at 4:02 PM Marco Moock wrote:
> > >
> > > On Wed, 15 Jan 2025 15:27:00 -0500,
> > > Alex Deucher wrote:
> > >
> > > > On Wed, Jan 15, 2025 at 3:22 PM Marco Mooc
[Public]
Asking for some advice here.
Coverity throws uninitialized errors (covered below), but at least the first
two (commented below) are explicitly set in the various function calls. Should
we be initializing them anyways, or should we only be doing that for the
variables where there's som
From: Mario Limonciello
When not set, the `gttsize` module parameter by default gets the
value to use for the GTT pool from the TTM page limit, which is
set by a separate module parameter.
This inevitably leads to people not sure which one to set when they
want more addressable memory for the GPU
Hi Tzung-Bi,
First of all, thanks for the patch!
On Thu, Jan 09, 2025 at 05:35:04AM +, Tzung-Bi Shih wrote:
> When compiling allmodconfig (CONFIG_WERROR=y) with clang-19, see the
> following errors:
>
> .../display/dc/dml2/display_mode_core.c:6268:13: warning: stack frame size
> (3128) exce
On Thu, Jan 16, 2025 at 8:02 AM Kamal, Asad wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi @Deucher, Alexander,
>
> Please hold off on this series; we are currently working on a refined version,
> and the current series will be dropped.
Sure. thanks.
Alex
>
> Thanks &
On Thu, Jan 16, 2025 at 7:29 AM Lijo Lazar wrote:
>
> Add capability flags for SMU v13.0.6 variants. Initialize the flags
> based on firmware support. As multiple IP versions are maintained,
> it is more manageable to initialize the caps flags once, based on
> IP version and firmware fe
On Thu, Jan 16, 2025 at 9:47 AM Nathan Chancellor wrote:
>
> Hi Tzung-Bi,
>
> First of all, thanks for the patch!
>
> On Thu, Jan 09, 2025 at 05:35:04AM +, Tzung-Bi Shih wrote:
> > When compiling allmodconfig (CONFIG_WERROR=y) with clang-19, see the
> > following errors:
> >
> > .../display/dc
On 2025-01-14 14:37, Alex Deucher wrote:
> This needs to be kerneldoc formatted.
>
> Fixes: 7594874227e1 ("drm/amd/display: add CEC notifier to amdgpu driver")
> Reported-by: Stephen Rothwell
> Signed-off-by: Alex Deucher
> Cc: Kun Liu
Reviewed-by: Harry Wentland
Harry
> ---
> drivers/g
Add capability flags for SMU v13.0.6 variants. Initialize the flags
based on firmware support. As multiple IP versions are maintained,
it is more manageable to initialize the caps flags once, based on
IP version and firmware feature support.
Signed-off-by: Lijo Lazar
---
drivers/gpu/d
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
V3: except IP_VERSION(13, 0, 12) which is not supported.
Suggested-by:
[AMD Official Use Only - AMD Internal Distribution Only]
Please ignore this; a new version will be sent.
Thanks
Jesse
-Original Message-
From: jesse.zh...@amd.com
Sent: Friday, January 17, 2025 11:07 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Koenig, Christian
;
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
V3: except IP_VERSION(13, 0, 12) which is not supported.
Suggested-by:
[AMD Official Use Only - AMD Internal Distribution Only]
You'd better rework this patch based on the following patch from Lijo.
[PATCH] drm/amd/pm: Add capability flags for SMU v13.0.6
Best Regards,
Kevin
-Original Message-
From: amd-gfx On Behalf Of
jesse.zh...@amd.com
Sent: Friday, Ja
On 1/13/2025 7:12 AM, Jiang Liu wrote:
> Enhance amdgpu_ras_block_late_fini() to revert what has been done
> by amdgpu_ras_block_late_init(), and fix a possible resource leakage
> in function amdgpu_ras_block_late_init().
>
> Signed-off-by: Jiang Liu
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_
[AMD Official Use Only - AMD Internal Distribution Only]
-Original Message-
From: Lin.Cao
Sent: Tuesday, January 14, 2025 6:06 PM
To: amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Deucher, Alexander
; cao, lin
Subject: [PATCH] drm/amdgpu: fix ring timeout issue in gfx10 sr-iov e
On 1/13/2025 7:12 AM, Jiang Liu wrote:
> Enhance amdgpu_ras_pre_fini() to better support suspend/resume by:
> 1) fix possible resource leakage. amdgpu_release_ras_context() only
>    calls kfree(con) but doesn't release resources associated with the
>    con object.
> 2) call amdgpu_ras_pre_fini() in
On 1/13/2025 7:12 AM, Jiang Liu wrote:
> Add a flag to track ras debugfs creation status, to avoid possible
> incorrect reference count management for ras block object in function
> amdgpu_ras_aca_is_supported().
>
> Signed-off-by: Jiang Liu
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h |
On 1/13/2025 7:12 AM, Jiang Liu wrote:
> Free all allocated resources on error recovery path in function
> amdgpu_ras_init().
>
> Signed-off-by: Jiang Liu
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 19 ++-
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --g
If a GPU device fails to probe, `rmmod amdgpu` will trigger a
use-after-free bug related to amdgpu_driver_release_kms(), as follows:
[16002.085540] BUG: kernel NULL pointer dereference, address:
[16002.093792] #PF: supervisor read access in kernel mode
[16002.03] #PF: error_code(0x0
This patchset tries to fix several memory leaks and invalid memory
accesses on the error handling paths during GPU driver loading/unloading.
It applies to:
https://gitlab.freedesktop.org/agd5f/linux.git amd-staging-drm-next
v5:
1) drop first in v4, we have found a reliable way to fix the issue.
2) add
Introduce amdgpu_device_fini_schedulers() to clean scheduler related
resources, and avoid possible invalid memory access.
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 35 +++---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 --
2 files changed,
When AMD GPU firmware files are missing, loading the amdgpu driver will
cause the following invalid memory access:
[ 89.735573] amdgpu :0a:00.0: amdgpu: Fetched VBIOS from platform
[ 89.735583] amdgpu: ATOM BIOS: 113-M3080202-101
[ 89.735676] amdgpu :0a:00.0: Direct firmware load for
am
Introduce a new interface amdgpu_xcp_drm_dev_free() to free a specific
drm_device created by amdgpu_xcp_drm_dev_alloc(), which will be used
to do error recovery.
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/amdxcp/amdgpu_xcp_drv.c | 63 +
drivers/gpu/drm/amd/amdxcp/amdgpu_
Enhance error handling in function amdgpu_pci_probe() to avoid
possible resource leakage.
Signed-off-by: Jiang Liu
Reviewed-by: Mario Limonciello
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdg
On 1/13/2025 7:12 AM, Jiang Liu wrote:
> Currently we track the refcount on ras block object for features by
> checking `if (obj && amdgpu_ras_is_feature_enabled(adev, head))`,
> which is a little unreliable. So introduce a dedicated flag to track
> the reference count.
>
Please clarify more o
On 1/13/2025 7:12 AM, Jiang Liu wrote:
> Add helper functions to track status for ras manager and ip blocks.
>
> Signed-off-by: Jiang Liu
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 38 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 37
>
On 1/12/2025 19:42, Jiang Liu wrote:
Enhance amdgpu_ras_pre_fini() to better support suspend/resume by:
1) fix possible resource leakage. amdgpu_release_ras_context() only
   calls kfree(con) but doesn't release resources associated with the
   con object.
2) call amdgpu_ras_pre_fini() in amdgpu_devi
On 1/12/2025 19:42, Jiang Liu wrote:
Rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini(), to keep the same
style as other code.
Besides amdgpu_ras_pre_fini() -> amdgpu_ras_early_fini() you also
changed amdgpu_ras_block_late_fini() -> amdgpu_ras_early_fini().
Is that really intended? If so
On 1/12/2025 19:42, Jiang Liu wrote:
Enhance amdgpu_dm_early_fini() so it can be called in power
management operations.
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/
On 1/16/2025 15:32, Alex Deucher wrote:
On Thu, Jan 16, 2025 at 1:29 PM Mario Limonciello wrote:
From: Mario Limonciello
When not set, the `gttsize` module parameter by default gets the
value to use for the GTT pool from the TTM page limit, which is
set by a separate module parameter.
This i
From: Mario Limonciello
When not set, the `gttsize` module parameter by default gets the
value to use for the GTT pool from the TTM page limit, which is
set by a separate module parameter.
This inevitably leads to people not sure which one to set when they
want more addressable memory for the GPU
From: Mario Limonciello
Effectively amdgpu.gttsize gets set to ~1/2 of RAM, but that's controlled
by what the TTM page limit is set to. Clarify the kdoc.
Signed-off-by: Mario Limonciello
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --
[AMD Official Use Only - AMD Internal Distribution Only]
-Original Message-
From: amd-gfx On Behalf Of Jiang Liu
Sent: Monday, January 13, 2025 09:42
To: Deucher, Alexander ; Koenig, Christian
; Pan, Xinhui ;
airl...@gmail.com; sim...@ffwll.ch; Khatri, Sunil ;
Lazar, Lijo ; Zhang, Hawk
[AMD Official Use Only - AMD Internal Distribution Only]
-Original Message-
From: amd-gfx On Behalf Of Jiang Liu
Sent: Monday, January 13, 2025 09:42
To: Deucher, Alexander ; Koenig, Christian
; Pan, Xinhui ;
airl...@gmail.com; sim...@ffwll.ch; Khatri, Sunil ;
Lazar, Lijo ; Zhang, Hawk
The purpose of halt_if_hws_hang is to preserve GPU state for driver
debugging when queue preemption fails. Issuing per-queue reset may
kill wavefronts which caused the preemption failure.
Signed-off-by: Jay Cornwall
Cc: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 4
On 1/12/2025 19:42, Jiang Liu wrote:
Enhance amdgpu_ras_block_late_fini() to revert what has been done
by amdgpu_ras_block_late_init(), and fix a possible resource leakage
in function amdgpu_ras_block_late_init().
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 +
[Public]
> -Original Message-
> From: Cornwall, Jay
> Sent: Thursday, January 16, 2025 3:41 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Cornwall, Jay ; Kim, Jonathan
>
> Subject: [PATCH] drm/amdkfd: Block per-queue reset when halt_if_hws_hang=1
>
> The purpose of halt_if_hws_hang is to
On 1/12/2025 19:42, Jiang Liu wrote:
Free all allocated resources on error recovery path in function
amdgpu_ras_init().
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 19 ++-
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu
On Thu, Jan 16, 2025 at 5:00 PM Mario Limonciello wrote:
>
> From: Mario Limonciello
>
> When not set, the `gttsize` module parameter by default gets the
> value to use for the GTT pool from the TTM page limit, which is
> set by a separate module parameter.
>
> This inevitably leads to people not
On Thu, Jan 16, 2025 at 5:07 PM Mario Limonciello wrote:
>
> From: Mario Limonciello
>
> Effectively amdgpu.gttsize gets set to ~1/2 of RAM, but that's controlled
> by what the TTM page limit is set to. Clarify the kdoc.
>
> Signed-off-by: Mario Limonciello
Reviewed-by: Alex Deucher
> ---
>
On Thu, Jan 16, 2025 at 1:29 PM Mario Limonciello wrote:
>
> From: Mario Limonciello
>
> When not set, the `gttsize` module parameter by default gets the
> value to use for the GTT pool from the TTM page limit, which is
> set by a separate module parameter.
>
> This inevitably leads to people not
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Yang Wang
Best Regards,
Kevin
-Original Message-
From: Lazar, Lijo
Sent: Thursday, January 16, 2025 20:29
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander
; Kamal, Asad ; Wang,
Yang(Kevin)
On 1/15/25 01:14, Simon Ser wrote:
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c
b/drivers/gpu/drm/drm_atomic_uapi.c
index a3e1fcad47ad..4744c12e429d 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -701,6 +701,9 @@ static int drm_atomic_color_set_dat
On Thu, Jan 16, 2025 at 4:42 PM Mario Limonciello wrote:
>
> On 1/16/2025 15:32, Alex Deucher wrote:
> > On Thu, Jan 16, 2025 at 1:29 PM Mario Limonciello
> > wrote:
> >>
> >> From: Mario Limonciello
> >>
> >> When not set, the `gttsize` module parameter by default gets the
> >> value to use for