[AMD Official Use Only - AMD Internal Distribution Only]
Hi Patrick,
I understand your concern, but I think a sudden power-off is not a usual case on
a server platform.
Regards,
Tao
> -----Original Message-----
> From: amd-gfx On Behalf Of Xie,
> Patrick
> Sent: Thursday, March 13, 2025 11:41 AM
Correct F8_MODE setting for gfx950 that was removed
Fixes: 1a9dbc31d234
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c
b/drivers
[AMD Official Use Only - AMD Internal Distribution Only]
Hi, Tao:
I am worried about a host reboot or power-down during the eeprom
formatting, which would cause the bad page info to be lost.
If the issue needs to be considered, I suggest saving the bad page info on
the host disk before eeprom formatt
A miscalculation of the remaining size of the CPER ring causes an
endless while loop when the CPER ring overflows.
Signed-off-by: Xiang Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
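The patch body is cut off here, so the actual fix is not visible. As a rough illustration of why the remaining-size computation matters, the following is a minimal sketch of free-space accounting for a generic ring buffer. It assumes a power-of-two ring size and one reserved slot; the names `sketch_ring` and `sketch_ring_remaining` are hypothetical and are not taken from amdgpu_cper.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring descriptor, not the driver's structure. */
struct sketch_ring {
	uint32_t size; /* number of slots, assumed to be a power of two */
	uint32_t rptr; /* next slot the consumer will read */
	uint32_t wptr; /* next slot the producer will write */
};

/*
 * Free slots left for the producer. One slot stays reserved so that
 * rptr == wptr always means "empty". Unsigned wraparound plus the mask
 * handles the case where wptr has wrapped past the end of the ring.
 */
static uint32_t sketch_ring_remaining(const struct sketch_ring *ring)
{
	assert(ring && ring->size && (ring->size & (ring->size - 1)) == 0);
	return (ring->rptr - ring->wptr - 1u) & (ring->size - 1u);
}
```

A loop that waits for or reclaims space until `remaining >= needed` only terminates if this value is computed consistently on both sides of the wrap point, which is presumably the kind of failure the commit message describes.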
[AMD Official Use Only - AMD Internal Distribution Only]
It looks good to me.
Reviewed-by: Yang Wang
Best Regards,
Kevin
-----Original Message-----
From: amd-gfx On Behalf Of Tomasz Pakula
Sent: Wednesday, March 12, 2025 05:39
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject:
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Harish Kasiviswanathan
From: Lin, Amber
Sent: Wednesday, March 12, 2025 9:39:21 PM
To: amd-gfx@lists.freedesktop.org
Cc: Kasiviswanathan, Harish ; Lin, Amber
Subject:
According to the MES API spec, type 0 means invalid. Pass the MES type to the ring
dump function to avoid confusion like the following:
[0x0@0x + 0x7200] [0x000400e1]Opcode 0xe
[MES_SCH_API_MISC] (64 words, type: 0, hdr: 0x400e1)
Signed-off-by: Yifan Zhang
---
src/lib/read_mes_stream.c |
Plumb in support for disabling kernel queues in
GFX11. We have to bring up a GFX queue briefly in
order to initialize the clear state. After that
we can disable it.
v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
Signed-off-by: Alex Deucher
---
dr
On 2025-03-09 20:51, Deng, Emily wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
*From:*Chen, Xiaogang
*Sent:* Saturday, March 8, 2025 8:38 AM
*To:* Deng, Emily ; amd-gfx@lists.freedesktop.org
*Subject:* Re: [PATCH v4
Hi Dave, Simona,
Fixes for 6.14.
The following changes since commit 80e54e84911a923c40d7bee33a34c1b4be148d7a:
Linux 6.14-rc6 (2025-03-09 13:45:25 -1000)
are available in the Git repository at:
https://gitlab.freedesktop.org/agd5f/linux.git
tags/amd-drm-fixes-6.14-2025-03-12
for you to fe
This would be set by IPs which only accept submissions
from the kernel, not userspace, such as when kernel
queues are disabled. Don't expose the rings to userspace
and reject any submissions in the CS IOCTL.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amd
For SDMA, we still need kernel queues for paging so
they need to be initialized, but we do not want to
accept submissions from userspace when disable_kq
is set.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 1 +
1 file changed, 1 insertion(
If we don't have kernel queues, the vmids can be used by
the MES for user queues.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gmc_v12_0.c | 2 +-
3 files cha
Make all resources available to user queues.
Suggested-by: Sunil Khatri
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
b/drivers
When the parameter is set, disable user submissions
to kernel queues.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdm
Plumb in support for disabling kernel queues.
v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 183 +
1 file changed, 125 insertions(+), 58 deletions(-)
When the parameter is set, disable user submissions
to kernel queues.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c
b/drivers/gpu/drm/amd/amdgpu/sdm
Add proper checks for disable_kq functionality in
gfx helper functions. Add special logic for families
that require the clear state setup.
v2: use ring count as per Felix suggestion
v3: fix num_gfx_rings handling in amdgpu_gfx_graphics_queue_acquire()
Signed-off-by: Alex Deucher
---
drivers/gp
On chips that support user queues, setting this option
disables kernel queues so that user queues can be
validated without kernel queues.
Reviewed-by: Prike Liang
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdg
To better evaluate user queues, add a module parameter
to disable kernel queues. With this set, kernel queues
are disabled and only user queues are available. This
frees up hardware resources for use in user queues which
would otherwise be used by kernel queues and provides
a way to validate user
On Tue, Mar 11, 2025 at 10:18 AM Liang, Prike wrote:
>
> [Public]
>
> > From: amd-gfx On Behalf Of Alex
> > Deucher
> > Sent: Friday, March 7, 2025 11:16 PM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander
> > Subject: [PATCH 03/11] drm/amdgpu/gfx: add generic handling for disabl
On Tue, Mar 11, 2025 at 9:13 AM Liang, Prike wrote:
>
> [Public]
>
> > From: amd-gfx On Behalf Of Alex
> > Deucher
> > Sent: Friday, March 7, 2025 2:46 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander
> > Subject: [PATCH 02/11] drm/amdgpu: add ring flag for no user submissions
This was leftover from MES bring up when we had MES
user queues in the kernel. It's no longer used so
remove it.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 112 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_r
On 3/12/25 02:11, Christian König wrote:
Am 11.03.25 um 18:10 schrieb Alex Hung:
This macro guard "__cplusplus" is unnecessary and should not be there.
Signed-off-by: Alex Hung
---
drivers/gpu/drm/amd/display/dc/sspl/dc_spl.h | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/
Am 11.03.25 um 18:10 schrieb Alex Hung:
> This macro guard "__cplusplus" is unnecessary and should not be there.
>
> Signed-off-by: Alex Hung
> ---
> drivers/gpu/drm/amd/display/dc/sspl/dc_spl.h | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/sspl/dc_spl.
On Wed, Mar 12, 2025 at 12:55 AM Feng, Kenneth wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Alex,
> I tested this patch. After the desktop is launched, at a certain time, the
> workload is set to 3d fullscreen twice, then
> The idle worker can't set it back to bootup
On 3/7/2025 7:18 PM, Christian König wrote:
That was quite troublesome for gang submit. Completely drop this
approach and enforce the isolation separately.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 +
On 3/7/2025 7:18 PM, Christian König wrote:
That was quite troublesome for gang submit. Completely drop this
approach and enforce the isolation separately.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 +--
On 3/7/2025 7:18 PM, Christian König wrote:
This allows using amdgpu_sync even without peeking into the fences for a
long time.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/driv
+ Kenneth, Kevin
On Wed, Mar 12, 2025 at 6:18 AM Tomasz Pakuła
wrote:
>
> Currently, it seems like the code was carried over from RDNA3 because
> it assumes two possible values to set. RDNA4, instead of having:
> 0: min SCLK
> 1: max SCLK
> only has
> 0: SCLK offset
>
> This change makes it so it
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Tao Zhou
> -----Original Message-----
> From: Xie, Patrick
> Sent: Wednesday, March 12, 2025 2:16 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Zhou1, Tao ; Xie, Patrick
> Subject: [PATCH] drm/amdgpu: Save PA of bad pages for
On Wed, Mar 12, 2025 at 4:19 AM Lazar, Lijo wrote:
>
>
>
> On 3/11/2025 7:47 PM, Alex Deucher wrote:
> > Only increment the power profile on the first submission.
> > Since the decrement may end up being pushed out as new
> > submissions come in, we only need to increment it once.
> >
> > Fixes: 1
We need to make sure the workload profile ref counts are
balanced. This isn't currently the case because we can
increment the count on submissions, but the decrement may
be delayed as work comes in. Track when we enable the
workload profile so the references are balanced.
v2: switch to a mutex a
We need to make sure the workload profile ref counts are
balanced. This isn't currently the case because we can
increment the count on submissions, but the decrement may
be delayed as work comes in. Track when we enable the
workload profile so the references are balanced.
v2: switch to a mutex a
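The balancing idea described above can be sketched as follows. This is a minimal single-threaded illustration, not the driver's code: the names `sketch_profile`, `sketch_ring_ctx`, `sketch_begin_use` and `sketch_idle_work` are hypothetical, and the locking mentioned in the v2 note is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared profile with a reference count. */
struct sketch_profile {
	int refcount;
};

/* Hypothetical per-ring state that remembers whether we hold a reference. */
struct sketch_ring_ctx {
	struct sketch_profile *profile;
	bool profile_active;
};

/* Called on every submission: takes a reference only the first time. */
static void sketch_begin_use(struct sketch_ring_ctx *ctx)
{
	assert(ctx && ctx->profile);
	if (!ctx->profile_active) {
		ctx->profile->refcount++;
		ctx->profile_active = true;
	}
}

/* Called from the (possibly delayed) idle handler: drops it at most once. */
static void sketch_idle_work(struct sketch_ring_ctx *ctx)
{
	assert(ctx && ctx->profile);
	if (ctx->profile_active) {
		ctx->profile->refcount--;
		ctx->profile_active = false;
	}
}
```

Because the flag records whether a reference is currently held, any number of submissions followed by one delayed idle callback leaves the count where it started. A real implementation would need to protect the flag and the count against concurrent access.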
[AMD Official Use Only - AMD Internal Distribution Only]
From: Koenig, Christian
Sent: Wednesday, March 12, 2025 4:39 PM
To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Kim, Jonathan
; Zhu, Jiadong
Subject: Re: [PATCH 1/7] drm/amd/amdgpu: Simplify SDMA reset mec
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.0 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:
1. **Generalized `sdma_v5_0_gfx_stop` Function**:
- Added an `inst_mask` parameter to allow stopping specific SDMA
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.2 reset logic by splitting the
`sdma_v5_2_reset_queue` function into two separate functions:
`sdma_v5_2_stop_queue` and `sdma_v5_2_restore_queue`.
This change aligns with the new SDMA reset mechanism, where the reset process
is divid
On Mon, Mar 10, 2025 at 06:03:27PM -0400, Alex Deucher wrote:
> On Mon, Mar 10, 2025 at 5:54 PM André Almeida wrote:
> >
> > Em 01/03/2025 03:04, Raag Jadav escreveu:
> > > On Fri, Feb 28, 2025 at 06:49:43PM -0300, André Almeida wrote:
> > >> Hi Raag,
> > >>
> > >> On 2/28/25 11:58, Raag Jadav wro
On Tue, Mar 11, 2025 at 07:09:45PM +0200, Raag Jadav wrote:
> On Mon, Mar 10, 2025 at 06:27:53PM -0300, André Almeida wrote:
> > Em 01/03/2025 02:53, Raag Jadav escreveu:
> > > On Fri, Feb 28, 2025 at 06:54:12PM -0300, André Almeida wrote:
> > > > Hi Raag,
> > > >
> > > > On 2/28/25 11:20, Raag Ja
From: "jesse.zh...@amd.com"
This patch removes the deprecated SDMA reset callback mechanism, which was
previously used to register pre-reset and post-reset callbacks for SDMA engine
resets.
The callback mechanism has been replaced with a more direct and efficient
approach using `stop_queue` a
From: "jesse.zh...@amd.com"
This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA
soft resets directly, rather than relying on the DPM interface.
1. **New `amdgpu_sdma_soft_reset` Function**:
- Implements a soft reset for SDMA engines by directly writing to the
hardwa
Clear old data and save it in V3 format.
v2: only format eeprom data for new ASICs.
Signed-off-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 +
.../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 26 ++-
.../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h| 1 +
Currently, it seems like the code was carried over from RDNA3 because
it assumes two possible values to set. RDNA4, instead of having:
0: min SCLK
1: max SCLK
only has
0: SCLK offset
This change makes it so it only reports current offset value instead of
showing possible min/max values and their i
On Wed, Mar 12, 2025 at 09:25:08AM +0100, Christian König wrote:
>Am 11.03.25 um 18:13 schrieb Raag Jadav:
>> On Mon, Mar 10, 2025 at 06:03:27PM -0400, Alex Deucher wrote:
>>> On Mon, Mar 10, 2025 at 5:54 PM André Almeida
>>> wrote:
Em 01/03/2025 03:04, Raag Jadav escreveu:
> On Fri, Feb
On Mon, Mar 10, 2025 at 06:27:53PM -0300, André Almeida wrote:
> Em 01/03/2025 02:53, Raag Jadav escreveu:
> > On Fri, Feb 28, 2025 at 06:54:12PM -0300, André Almeida wrote:
> > > Hi Raag,
> > >
> > > On 2/28/25 11:20, Raag Jadav wrote:
> > > > Cc: Lucas
> > > >
> > > > On Fri, Feb 28, 2025 at 09
In gfx_v12_0_cp_gfx_load_me_microcode_rs64(), gfx_v12_0_pfp_fini() is
incorrectly used to free 'me' field of 'gfx', since gfx_v12_0_pfp_fini()
can only release 'pfp' field of 'gfx'. The release function of 'me' field
should be gfx_v12_0_me_fini().
Fixes: 52cb80c12e8a ("drm/amdgpu: Add gfx v12_0 ip
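The class of bug described above, an error path that calls the release function for the wrong field, can be sketched in plain C. This is an illustration only: `sketch_gfx`, `sketch_pfp_fini`, `sketch_me_fini` and `sketch_load_me` are hypothetical stand-ins and do not reproduce the gfx_v12_0 code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container with two independently owned firmware buffers. */
struct sketch_gfx {
	void *pfp_fw;
	void *me_fw;
};

/* Releases only the 'pfp' buffer. */
static void sketch_pfp_fini(struct sketch_gfx *gfx)
{
	free(gfx->pfp_fw);
	gfx->pfp_fw = NULL;
}

/* Releases only the 'me' buffer. */
static void sketch_me_fini(struct sketch_gfx *gfx)
{
	free(gfx->me_fw);
	gfx->me_fw = NULL;
}

/* Returns 0 on success, -1 on failure. */
static int sketch_load_me(struct sketch_gfx *gfx, int simulate_failure)
{
	assert(gfx);
	gfx->me_fw = malloc(64);
	if (!gfx->me_fw)
		return -1;
	if (simulate_failure) {
		/* The error path must undo what this function set up: 'me'. */
		sketch_me_fini(gfx);
		return -1;
	}
	return 0;
}
```

Calling `sketch_pfp_fini()` in that error path instead would leak the 'me' buffer and free a 'pfp' buffer that the rest of the code still expects to be valid.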
On Wed, Mar 12, 2025 at 03:39:41PM +0800, Chung, ChiaHsuan (Tom) wrote:
> The original code will check the drm_new_conn_state if it's valid in here
>
> 10712 if (IS_ERR(drm_new_conn_state)) {
>
That's checking for error pointers. It's irrelevant. The warning is
about NULL point
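The distinction made in this reply can be shown with a small userspace sketch of the kernel's error-pointer convention, in which a negative errno is encoded in the top 4095 values of the address space. The `sketch_` helpers below are re-implementations written for illustration, not the kernel's own `IS_ERR()` or `IS_ERR_OR_NULL()` macros.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True only for pointers in the reserved error range; NULL is not in it. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Covers both failure styles: an encoded errno or a plain NULL. */
static inline int sketch_is_err_or_null(const void *ptr)
{
	return ptr == NULL || sketch_is_err(ptr);
}
```

Under this convention an error-pointer check lets a NULL pointer through, so it does not by itself rule out a later NULL dereference.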
From: "jesse.zh...@amd.com"
Since KFD no longer registers its own callbacks for SDMA resets, and only KGD
uses the reset mechanism, we can simplify the SDMA reset flow by directly
calling the ring's `stop_queue` and `start_queue` functions.
This patch removes the dynamic callback mechanism and
[AMD Official Use Only - AMD Internal Distribution Only]
-----Original Message-----
From: Koenig, Christian
Sent: Wednesday, March 12, 2025 4:07 PM
To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Kim, Jonathan
; Zhu, Jiadong
Subject: Re: [PATCH 1/7] drm/amd/amdgpu
Am 12.03.25 um 09:15 schrieb Zhang, Jesse(Jie):
> [SNIP]
>> -
>> + gfx_ring->funcs->stop_queue(adev, instance_id);
> Yeah that starts to look good. Question here is who is calling
> amdgpu_sdma_reset_engine()?
>
> If this call comes from engine specific code we might not need the
> start/stop
Am 11.03.25 um 18:13 schrieb Raag Jadav:
> On Mon, Mar 10, 2025 at 06:03:27PM -0400, Alex Deucher wrote:
>> On Mon, Mar 10, 2025 at 5:54 PM André Almeida wrote:
>>> Em 01/03/2025 03:04, Raag Jadav escreveu:
On Fri, Feb 28, 2025 at 06:49:43PM -0300, André Almeida wrote:
> Hi Raag,
>
>>
On 3/11/2025 7:47 PM, Alex Deucher wrote:
> Only increment the power profile on the first submission.
> Since the decrement may end up being pushed out as new
> submissions come in, we only need to increment it once.
>
> Fixes: 1443dd3c67f6 ("drm/amd/pm: fix and simplify workload handling")
> C
Am 12.03.25 um 08:59 schrieb jesse.zh...@amd.com:
> From: "jesse.zh...@amd.com"
>
> Since KFD no longer registers its own callbacks for SDMA resets, and only KGD
> uses the reset mechanism,
> we can simplify the SDMA reset flow by directly calling the ring's
> `stop_queue` and `start_queue` func
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.0 reset logic by splitting the
`sdma_v5_0_reset_queue` function into two separate functions:
`sdma_v5_0_stop_queue` and `sdma_v5_0_restore_queue`.
This change aligns with the new SDMA reset mechanism, where the reset process
is divi
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.2 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:
1. **Generalized `sdma_v5_2_gfx_stop` Function**:
- Added an `inst_mask` parameter to allow stopping specific
The original code checks whether drm_new_conn_state is valid here:
10712 if (IS_ERR(drm_new_conn_state)) {
After that, drm_new_conn_state is not touched by anyone before the call to
--> 10751 ret = fill_hdr_info_packet(drm_new_conn_state,
I think it shoul