On 2/14/2025 11:05 PM, Alex Deucher wrote:
Re-send the mes message on resume to make sure the
mes state is up to date.
Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES")
Signed-off-by: Alex Deucher
Cc: Shaoyun Liu
Cc: Srinivasan Shanmugam
---
drivers/gpu/drm/amd/amdgpu/am
[AMD Official Use Only - AMD Internal Distribution Only]
I think I should make it clearer. When MES is being used, no matter whether it is
pipe0 or pipe1, we expect both set_hw_resource and set_hw_resource_1 to be
called; that is a requirement for mes_v12 and later. For non-unified MES config,
the pi
On Fri, Feb 14, 2025 at 6:38 PM Rodrigo Siqueira wrote:
>
> On 02/14, Alex Deucher wrote:
> > On Fri, Feb 14, 2025 at 6:00 PM Rodrigo Siqueira
> > wrote:
> > >
> > > Users can check the file "/sys/kernel/debug/dri/0/amdgpu_firmware_info"
> > > to get information on the firmware loaded in the sys
On 02/14, Alex Deucher wrote:
> On Fri, Feb 14, 2025 at 6:00 PM Rodrigo Siqueira wrote:
> >
> > Users can check the file "/sys/kernel/debug/dri/0/amdgpu_firmware_info"
> > to get information on the firmware loaded in the system. This file has
> > multiple acronyms that are not documented in the gl
On Fri, Feb 14, 2025 at 6:00 PM Rodrigo Siqueira wrote:
>
> Users can check the file "/sys/kernel/debug/dri/0/amdgpu_firmware_info"
> to get information on the firmware loaded in the system. This file has
> multiple acronyms that are not documented in the glossary. This commit
> introduces some mi
Users can check the file "/sys/kernel/debug/dri/0/amdgpu_firmware_info"
to get information on the firmware loaded in the system. This file has
multiple acronyms that are not documented in the glossary. This commit
introduces some missing acronyms to the AMD glossary documentation. The
meaning of ea
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Robert,
We only had one enum: COLOR_SPACE_2020_YCBCR.
On the output side this assumed limited range.
On the input side this apparently assumed full range given the dpp matrix.
Now we split it into two enums to distinguish them and add li
[Public]
Hi Robert, thank you for the feedback.
What about this version of commit message:
Fix BT2020 YCbCr limited/full range input
[Why]
BT2020 YCbCr input is not handled properly when full range
quantization is used and limited range is not supported at all.
[How]
- Add enums for BT2020 YCb
[Public]
Are there any cases where the asic_type check would cause this register to fail
to get programmed?
Alex
From: amd-gfx on behalf of Victor Lu
Sent: Thursday, February 13, 2025 7:13 PM
To: amd-gfx@lists.freedesktop.org
Cc: Lu, Victor Cheng Chi (Victor
[AMD Official Use Only - AMD Internal Distribution Only]
Oh, you're right. It's only for unified MES; for non-unified, it will still
use the kiq from CP directly on pipe1. So there is no MES API for it at all.
It's my fault; please ignore my previous comments. Your current change for
this
[AMD Official Use Only - AMD Internal Distribution Only]
Does it matter which pipe we use for these packets?
Alex
From: Liu, Shaoyun
Sent: Friday, February 14, 2025 12:36 PM
To: Deucher, Alexander ;
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH 2/2] drm/am
[AMD Official Use Only - AMD Internal Distribution Only]
OK. From the MES point of view, we expect both set_hw_resource and
set_hw_resource_1 to be called all the time.
Reviewed-by: Shaoyun.liu
From: Deucher, Alexander
Sent: Friday, February 14, 2025 11:53 AM
To: Liu, Shaoyun ; amd-gfx@list
Re-send the mes message on resume to make sure the
mes state is up to date.
Fixes: 8521e3c5f058 ("drm/amd/amdgpu: limit single process inside MES")
Signed-off-by: Alex Deucher
Cc: Shaoyun Liu
Cc: Srinivasan Shanmugam
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 13 -
drivers/gpu/d
[AMD Official Use Only - AMD Internal Distribution Only]
I can add that as a follow up patch as I don't want to change the current
behavior to avoid a potential regression. Should we submit both the resource
and resource_1 packets all the time?
Thanks,
Alex
F
On Fri, Feb 14, 2025 at 11:42 AM Liu, Shaoyun wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Looks good to me.
> Reviewed-by: Shaoyun.liu < shaouyun@amd.com>
Thanks, is this for the whole series or just this patch?
Alex
>
> -----Original Message-----
> From: amd-gf
[AMD Official Use Only - AMD Internal Distribution Only]
I'd suggest removing the enable_uni_mes check; set_hw_resource_1 is always
required for gfx12 and up, especially after adding the cleaner_shader_fence_addr
there.
Regards
Shaoyun.liu
-----Original Message-----
From: amd-gfx On Behalf Of A
[AMD Official Use Only - AMD Internal Distribution Only]
Looks good to me.
Reviewed-by: Shaoyun.liu < shaouyun@amd.com>
-----Original Message-----
From: amd-gfx On Behalf Of Alex Deucher
Sent: Friday, February 14, 2025 10:19 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subje
> > > Fixes")
> > > Signed-off-by: Nathan Chancellor
> > > ---
> > > If you would prefer reapplying the local fix, feel free to do so, but I
> > > would like for it to be in the upstream source so it does not have to
> > > keep being applied.
t to be in the upstream source so it does not have to
> > keep being applied.
>
> I've reapplied the original fix and I've confirmed that the fix will
> be pushed to the DML tree as well this time.
Did that actually end up happening? Commit 1b30456150e5
("drm/amd/display: DML21 Reintegration") in next-20250214 reintroduces
this warning... I guess it may be a timing thing because the author date
is three weeks ago or so. Should I send my "Reapply" patch or will you
take care of it?
Cheers,
Nathan
[Public]
We could be talking about 2 types of bandwidth here.
1. Bandwidth per link
2. Bandwidth per peer i.e. multiple xgmi links that are used for SDMA gang
submissions for effective max bandwidth * num_link copy speed. This is
currently used by runtime i.e. max divide by min. The numb
[Public]
For minimum bandwidth, we should keep the possibility of going to FW to get the
data when XGMI DPM is in place. So it is all wrapped inside the API when the
devices passed are connected. The caller doesn't need to know.
BTW, what is the real requirement of bandwidth data without any pe
Allocate the buffer at sw init time so we don't alloc
and free it for every suspend/resume or reset cycle.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 39 +-
1 file changed, 19 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/amd/a
Allocate the buffer at sw init time so we don't alloc
and free it for every suspend/resume or reset cycle.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 52 +-
1 file changed, 26 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/amd/a
>> On 2/14/2025 2:39 PM, Christian König wrote:
>>> Am 14.02.25 um 09:57 schrieb Srinivasan Shanmugam:
>>> RLCG Register Access is a way for virtual functions to safely access GPU
>>> registers in a virtualized environment, including TLB flushes and
>>> register reads. When multiple threads o
From: Rodrigo Siqueira
Introduce the DCC and Tiling reset callback to all DCN versions that can
call it.
Reviewed-by: Alvin Lee
Signed-off-by: Rodrigo Siqueira
Signed-off-by: Roman Li
---
drivers/gpu/drm/amd/display/dc/core/dc_surface.c| 13 ++---
.../gpu/drm/amd/display/dc/hwss/
From: Rodrigo Siqueira
This commit introduces a function helper for resetting DCN/DCE DCC and
tiling. Those functions are generic for their respective DCN/DCE, so
they were added to the oldest version of each architecture.
Reviewed-by: Alvin Lee
Signed-off-by: Rodrigo Siqueira
Signed-off-by: R
From: Nicholas Kazlauskas
[Why]
We should never apply a minimum dispclk value while in prepare_bandwidth
or while displays are active. This is always an optimization for when
all displays are disabled.
[How]
Defer dispclk optimization until safe_to_lower = true and display_count
reaches 0.
Sinc
From: Oleh Kuzhylnyi
[Why]
The informative structure needs to be extended by the total number of DPPs
required per each active plane.
The new informative field is going to be used as a statistical indicator.
[How]
The dml2_core_calcs_get_informative() routine must count a total number of DPPs.
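The counting step described above can be illustrated with a minimal sketch. It assumes a hypothetical per-plane record (`struct plane_info`) and a hypothetical helper (`count_total_dpps`); neither is the real DML2 structure or routine, which are not shown in this message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-plane record; not the real DML2 structure. */
struct plane_info {
    bool active;            /* plane is part of the current configuration */
    unsigned int dpp_count; /* DPPs required by this plane */
};

/* Sum the DPPs required by every active plane (hypothetical helper). */
unsigned int count_total_dpps(const struct plane_info *planes, size_t num_planes)
{
    unsigned int total = 0;

    for (size_t i = 0; i < num_planes; i++) {
        if (planes[i].active)
            total += planes[i].dpp_count;
    }
    return total;
}
```

Inactive planes are skipped, so the result reflects only what the active configuration needs.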
From: Rodrigo Siqueira
Rename dc_plane_force_update_for_panic to
dc_plane_force_dcc_and_tiling_disable to describe the function operation
in the name. Also, this function might be used in other contexts, and a
more generic name can be helpful for this purpose.
Reviewed-by: Alvin Lee
Signed-off-
From: George Shen
[Why]
The latest DP spec requires the DP TX to read DPCD F0000h through F0009h
when detecting LTTPR capabilities for the first time.
[How]
Update LTTPR cap retrieval to read up to F0009h (two more bytes than the
previous F0007h), and store the LTTPR ALPM capabilities.
Reviewed
From: Taimur Hassan
Summary:
* Add support for disconnected eDP streams
* Add log for MALL entry on DCN32x
* Add DCC/Tiling reset helper for DCN and DCE
* Guard against setting dispclk low when active
* Other minor fixes
Reviewed-by: Aurabindo Pillai
Signed-off-by: Taimur Hassan
Signed-off-by
From: Harry VanZyllDeJong
[Why]
eDP may not be connected to the GPU on driver start, causing
enumeration to fail.
[How]
Move the virtual signal type check before the eDP connector
signal check.
Reviewed-by: Wenjing Liu
Signed-off-by: Harry VanZyllDeJong
Signed-off-by: Roman Li
---
.../drm/amd/d
From: Aurabindo Pillai
[Why&How]
Add a dyndbg log entry to check whether the driver requested scanout
from MALL cache to PMFW via DMCUB
Reviewed-by: Zaeem Mohamed
Reviewed-by: Roman Li
Signed-off-by: Aurabindo Pillai
---
drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c | 2 ++
1 file
From: Peichen Huang
[WHY]
In the current HPO DP2 implementation, the driver enables/disables the DIG
encoder when configuring HPO DP2. Therefore, USB4 DP tunnelling should
not use the DIG encoder if the corresponding PHY is used by an HPO DP2
stream.
[HOW]
A DP2 stream is treated as a dig stream.
Reviewe
From: Alex Hung
[WHAT & HOW]
Add a message so users know the stream will be used for seamless boot.
Reviewed-by: Mario Limonciello
Reviewed-by: Rodrigo Siqueira
Signed-off-by: Alex Hung
Signed-off-by: Roman Li
---
drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 4 +++-
1 file changed, 3
From: Ilya Bakoulin
[Why/How]
Need to add support for full-range quantization for YCbCr in BT2020
color space.
Reviewed-by: Krunoslav Kovac
Signed-off-by: Ilya Bakoulin
Signed-off-by: Roman Li
Tested-by: Robert Mader
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +++---
From: Harry Wentland
Don't try to operate on a drm_wb_connector as an amdgpu_dm_connector.
While dereferencing aconnector->base will "work" it's wrong and
might lead to unknown bad things. Just... don't.
Reviewed-by: Alex Hung
Signed-off-by: Harry Wentland
Signed-off-by: Roman Li
---
.../gpu
From: Rodrigo Siqueira
Introduce the DCC and Tiling reset callback to all DCE versions that can
call it.
Reviewed-by: Alvin Lee
Signed-off-by: Rodrigo Siqueira
Signed-off-by: Roman Li
---
.../gpu/drm/amd/display/dc/core/dc_surface.c | 18 ++
.../amd/display/dc/dce60/dce60_h
From: Roman Li
Summary:
* Add support for disconnected eDP streams
* Add log for MALL entry on DCN32x
* Add DCC/Tiling reset helper for DCN and DCE
* Guard against setting dispclk low when active
* Other minor fixes
Cc: Daniel Wheeler
Alex Hung (1):
drm/amd/display: Print seamless boot mess
From: Ovidiu Bunea
[why & how]
By default, DCN HW is in idle optimized state which does not allow access
to PHY registers. If BIOS powers up the DCN, it is fine because they will
power up everything. Only exit idle optimized state when not taking control
from VBIOS.
Fixes: 53f82eb16293 ("Revert
From: Leo Zeng
This reverts commit aaa44ed6cd8af2089d2bf6a2e66a0436fef9791f.
Reason to revert: idle power regression found in testing.
Reviewed-by: Dillon Varone
Signed-off-by: Leo Zeng
Signed-off-by: Roman Li
---
drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 1 -
1 file changed, 1
[Public]
> -----Original Message-----
> From: Lazar, Lijo
> Sent: Friday, February 14, 2025 12:58 AM
> To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: simplify xgmi peer info calls
>
>
>
> On 2/13/2025 9:20 PM, Kim, Jonathan wrote:
> > [Public]
> >
> >> -O
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Ruijing Dong
-----Original Message-----
From: amd-gfx On Behalf Of David Rosca
Sent: Thursday, February 13, 2025 12:07 PM
To: amd-gfx@lists.freedesktop.org
Cc: Rosca, David
Subject: [PATCH] drm/amdgpu/display: Allow DCC for
On Fri, Feb 14, 2025 at 7:32 AM Kenneth Feng wrote:
>
> extend the gfxoff delay for compute workload on smu 14.0.2/3
> to fix the kfd test issue.
This doesn't make sense. We explicitly disallow gfxoff in
amdgpu_amdkfd_set_compute_idle() already so it should already be
disallowed.
Alex
>
> Sig
We can re-order some struct members and take u32 credits outside of the
pointer sandwich and also for the last_dependency member we can get away
with an unsigned int since for dependency we use xa_limit_32b.
Pahole report before:
/* size: 160, cachelines: 3, members: 14 */
/* sum m
Now that we have a header file for internal scheduler interfaces we can
move some more prototypes into it. By doing that we eliminate the chance
of drivers trying to use something which was not intended to be used.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matth
Add a basic test for exercising modifying the entities scheduler list at
runtime.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/tests/tests_basic.c | 73 ++-
1 file changed, 72 insert
Hi Christian,
On 11/02/2025 10:21, Christian König wrote:
Am 11.02.25 um 11:08 schrieb Philipp Stanner:
On Tue, 2025-02-11 at 09:22 +0100, Christian König wrote:
Am 06.02.25 um 17:40 schrieb Tvrtko Ursulin:
Replace a copy of DRM scheduler's to_drm_sched_job with a copy of a
newly
added __dr
The helper is for scheduler internal use, so let's hide it from DRM drivers
completely.
At the same time we change the method of checking whether there is
anything in the queue from peeking to looking at the node count.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matt
Add some basic tests for exercising entity priority handling.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/tests/tests_basic.c | 99 ++-
1 file changed, 98 insertions(+), 1 deletion(
There has repeatedly been quite a bit of apprehension when any change to the DRM
scheduler is proposed, for two main reasons: first, the code base is considered
fragile, not well understood and not very well documented, and secondly there is
a lack of systematic testing outside the vendor-specific test suites
Implement a mock scheduler backend and add some basic tests to exercise the
core scheduler code paths.
Mock backend (kind of like a very simple mock GPU) can either process jobs
by tests manually advancing the "timeline" one job at a time, or alternatively
jobs can be configured with a time duration in
On 14/02/2025 10:31, Christian König wrote:
Am 14.02.25 um 11:21 schrieb Tvrtko Ursulin:
Hi Christian,
On 11/02/2025 10:21, Christian König wrote:
Am 11.02.25 um 11:08 schrieb Philipp Stanner:
On Tue, 2025-02-11 at 09:22 +0100, Christian König wrote:
Am 06.02.25 um 17:40 schrieb Tvrtko Ur
Do a bit of house keeping in gpu_scheduler.h by grouping the API by type
of object it operates on.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
include/drm/gpu_scheduler.h | 60 -
1 file c
Replace a copy of DRM scheduler's to_drm_sched_job with a copy of a newly
added drm_sched_entity_queue_pop.
This allows breaking the hidden dependency that queue_node has to be the
first element in struct drm_sched_job.
A comment is also added with a reference to the mailing list discussion
expla
Let's add some helpers for peeking and popping from the job queue, which allows us
to re-order the fields in struct drm_sched_job and remove one hole.
As in the process we have added a header file for scheduler internal prototypes,
let's also use it more and clean up the "exported" header a bit.
v2:
Add a very simple timeout test which submits a single job and verifies
that the timeout handling will run if the backend failed to complete the
job in time.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
.../gpu/drm/scheduler/
Move some options out into a new debug specific kconfig file in order to
make things a bit cleaner.
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/Kconfig | 109 ++
drivers/gpu/drm/Kconfig.debug | 103
2 files changed, 108
Idea is to add helpers for peeking and popping jobs from entities with
the goal of decoupling the hidden assumption in the code that queue_node
is the first element in struct drm_sched_job.
That assumption usually comes in the form of:
while ((job = to_drm_sched_job(spsc_queue_pop(&entity->job_
On 2/14/2025 6:09 PM, Christian König wrote:
Yeah, completely agree.
But not checking the syncobj handle before doing the update is actually even
more problematic than leaking the memory.
This could be used by userspace to put the kernel into a broken situation it
can't get out of anymore.
Yeah, completely agree.
But not checking the syncobj handle before doing the update is actually even
more problematic than leaking the memory.
This could be used by userspace to put the kernel into a broken situation it
can't get out of anymore.
Arvin can you take care of the complete fix?
Tha
[AMD Official Use Only - AMD Internal Distribution Only]
Better to put the fence outside amdgpu_gem_va_update_vm. Since it is passed to
the caller, and the caller must keep one reference at least until this fence is
no longer needed.
Thanks
River
-----Original Message-----
From: amd-gfx On Be
[Public]
The implementation of the gfx11/gfx12 pipe reset is derived from the gfx9 pipe
reset sequence. Consequently, the driver sequence may not undergo significant
changes except for incorporating gfx11/gfx12 firmware support for the pipe
reset. To reduce the effort needed to address merge co
On 2/14/2025 4:08 PM, Christian König wrote:
Adding Arvind, please make sure to keep him in the loop.
Am 14.02.25 um 11:07 schrieb Le Ma:
On systems with CONFIG_SLUB_DEBUG enabled, a memory leak like the one below
shows up explicitly during driver unloading if a BO was created without a
drm_timeline object
extend the gfxoff delay for compute workload on smu 14.0.2/3
to fix the kfd test issue.
Signed-off-by: Kenneth Feng
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 3 +++
drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 14 ++
drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 1 +
Am 14.02.25 um 11:34 schrieb Tvrtko Ursulin:
>
> On 14/02/2025 10:31, Christian König wrote:
>> Am 14.02.25 um 11:21 schrieb Tvrtko Ursulin:
>>>
>>> Hi Christian,
>>>
>>> On 11/02/2025 10:21, Christian König wrote:
Am 11.02.25 um 11:08 schrieb Philipp Stanner:
> On Tue, 2025-02-11 at 09:22
Adding Arvind, please make sure to keep him in the loop.
Am 14.02.25 um 11:07 schrieb Le Ma:
> On systems with CONFIG_SLUB_DEBUG enabled, a memory leak like the one below
> shows up explicitly during driver unloading if a BO was created without a
> drm_timeline object beforehand.
>
> BUG drm_sched_fence (Tainte
Am 14.02.25 um 11:21 schrieb Tvrtko Ursulin:
>
> Hi Christian,
>
> On 11/02/2025 10:21, Christian König wrote:
>> Am 11.02.25 um 11:08 schrieb Philipp Stanner:
>>> On Tue, 2025-02-11 at 09:22 +0100, Christian König wrote:
Am 06.02.25 um 17:40 schrieb Tvrtko Ursulin:
> Replace a copy of DRM
On systems with CONFIG_SLUB_DEBUG enabled, a memory leak like the one below
shows up explicitly during driver unloading if a BO was created without a
drm_timeline object beforehand.
BUG drm_sched_fence (Tainted: G OE ): Objects remaining in
drm_sched_fence on __kmem_cache_shutdown()
On 2/14/2025 2:39 PM, Christian König wrote:
Am 14.02.25 um 09:57 schrieb Srinivasan Shanmugam:
RLCG Register Access is a way for virtual functions to safely access GPU
registers in a virtualized environment, including TLB flushes and
register reads. When multiple threads or VFs try to access
Am 14.02.25 um 09:57 schrieb Srinivasan Shanmugam:
> RLCG Register Access is a way for virtual functions to safely access GPU
> registers in a virtualized environment, including TLB flushes and
> register reads. When multiple threads or VFs try to access the same
> registers simultaneously, it can
RLCG Register Access is a way for virtual functions to safely access GPU
registers in a virtualized environment, including TLB flushes and
register reads. When multiple threads or VFs try to access the same
registers simultaneously, it can lead to race conditions. By using the
RLCG interface, the
Am 13.02.25 um 18:50 schrieb Srinivasan Shanmugam:
> By adding these NULL pointer checks and improving error handling, we can
> prevent crashes when the enforce_isolation sysfs file is accessed on
> non-supported systems.
>
> Cc: Christian König
> Cc: Alex Deucher
> Signed-off-by: Srinivasan Shan
Gentle ping :)
On 1/14/25 16:58, Nikita Zhandarovich wrote:
> This patch removes useless NULL pointer checks in functions like
> ci_set_private_data_variables_based_on_pptable() and
> ci_setup_default_dpm_tables().
>
> The pointers in question are initialized as addresses to existing
> structures
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
So, in order to avoid ending up with a flexible-array member in the
middle of other structs, we use the `struct_group_tagged()` helper
to create a new tagged `struct NISLANDS_SMC_SWSTATE_HDR`
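The grouping technique named above can be sketched in plain C11. `STRUCT_GROUP_TAGGED` below is a simplified userspace stand-in for the kernel's `struct_group_tagged()` helper, and the structures and field names (`sw_state`, `sw_state_hdr`, `state_table`) are hypothetical; they are not the real NISLANDS definitions, which this message does not show.

```c
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's struct_group_tagged(): the same
 * members are declared twice inside an anonymous union, once as an
 * anonymous struct (direct access) and once as a tagged, named struct.
 */
#define STRUCT_GROUP_TAGGED(TAG, NAME, ...)  \
    union {                                  \
        struct { __VA_ARGS__ };              \
        struct TAG { __VA_ARGS__ } NAME;     \
    }

/* Hypothetical structure ending in a flexible-array member. */
struct sw_state {
    STRUCT_GROUP_TAGGED(sw_state_hdr, hdr,
        unsigned char flags;
        unsigned char level_count;
    );
    unsigned int levels[];   /* flexible array stays at the end */
};

/* Embedding only the header type keeps the flexible array out of the middle. */
struct state_table {
    struct sw_state_hdr initial;
    unsigned int version;
};

/* Code that only needs the fixed part can take the tagged header type. */
unsigned char hdr_level_count(const struct sw_state_hdr *h)
{
    return h->level_count;
}
```

Existing users can keep writing `state.flags`, while containers embed `struct sw_state_hdr` instead of the full structure.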
On 2/14/2025 5:43 AM, Victor Lu wrote:
> Aldebaran SRIOV VF does not have write permissions to GRBM_CTNL.
> This access can be skipped to avoid a dmesg warning.
>
> Signed-off-by: Victor Lu
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-
Generate a CPER record when the bad page threshold is exceeded and
commit it to the CPER ring.
v2: return -ENOMEM instead of false
v2: check return value of fill section function
Signed-off-by: Xiang Liu
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 23 +++
drivers/gpu
On 2/14/2025 5:43 AM, Victor Lu wrote:
> Aldebaran SRIOV VF cannot access the power brake feature regs.
> The accesses can be skipped to avoid a dmesg warning.
>
> Signed-off-by: Victor Lu
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
Commit the CPER entry to the ring buffer.
Signed-off-by: Xiang Liu
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
in
From: Tao Zhou
Old CPER data will be overwritten if the ring buffer is full, and the read
pointer always points to a CPER header.
Signed-off-by: Tao Zhou
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 93
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.h | 2
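The overwrite behaviour described in this patch can be sketched as a small record ring. This is an illustration under stated assumptions, not the amdgpu implementation: `struct rec_ring`, `rec_ring_write` and the one-dword header that stores the record length are all hypothetical, and a real CPER header is a larger structure.

```c
#include <stdint.h>

#define RING_DW 8u  /* ring capacity in dwords; illustrative only */

struct rec_ring {
    uint32_t buf[RING_DW];
    uint32_t rptr;   /* index of the oldest record's header */
    uint32_t wptr;   /* index where the next record is written */
    uint32_t used;   /* dwords currently occupied */
};

/*
 * Append one record. rec[0] is the record header and holds the record
 * length in dwords (header included). Returns 0 on success, -1 if the
 * record is malformed or can never fit.
 */
int rec_ring_write(struct rec_ring *r, const uint32_t *rec, uint32_t len)
{
    if (len == 0 || len > RING_DW || rec[0] != len)
        return -1;

    /* Drop whole old records until the new one fits, so the read
     * pointer always lands on the header of a surviving record. */
    while (RING_DW - r->used < len) {
        uint32_t old_len = r->buf[r->rptr];

        r->rptr = (r->rptr + old_len) % RING_DW;
        r->used -= old_len;
    }

    for (uint32_t i = 0; i < len; i++)
        r->buf[(r->wptr + i) % RING_DW] = rec[i];
    r->wptr = (r->wptr + len) % RING_DW;
    r->used += len;
    return 0;
}
```

Advancing the read pointer by whole records, rather than by the number of dwords needed, is what keeps it aligned to a header after an overwrite.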
From: Hawking Zhang
AMD is using the Common Platform Error Record (CPER) format
to report all GPU hardware errors.
v2: add program attribute
Signed-off-by: Hawking Zhang
Signed-off-by: Xiang Liu
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/include/amd_cper.h | 269 +
1
From: Tao Zhou
Avoid conflicts between reads and writes of the ring buffer.
Signed-off-by: Tao Zhou
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 4
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 21 +++
Get system local time and encode it to timestamp for CPER.
Signed-off-by: Xiang Liu
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 19 ++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
b/drivers/gpu/
From: Hawking Zhang
Encode the error information in CPER format and commit
to the cper ring
Signed-off-by: Hawking Zhang
Reviewed-by: Yang Wang
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 32 +
1 file changed, 32 insertions(+)
diff --git a/dri
From: Tao Zhou
We read CPER data from read pointer to write pointer without changing
the pointers.
Signed-off-by: Tao Zhou
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 47 ++--
1 file changed, 36 insertions(+), 11 deletions(-)
diff --git a/dri
This patch series generates RAS CPER records for UE/DE/CE/BP threshold exceeded
events. SMU_TYPE_CE banks are combined into 1 CPER entry; they could be CEs or
DEs or both. UEs and BPs are encoded into separate CPER entries.
RAS CPER records for CEs will be generated only after the CE count has been queried.
From: Hawking Zhang
Introduce new functions that are used to generate
cper ue or ce records.
v2: return -ENOMEM instead of false
v2: check return value of fill section function
Signed-off-by: Hawking Zhang
Signed-off-by: Xiang Liu
Reviewed-by: Yang Wang
Reviewed-by: Tao Zhou
---
drivers/gp
From: Tao Zhou
And initialize it, this is a pure software ring to store RAS CPER data.
v2: update the initialization of count_dw of cper ring, it's dword
variable.
Signed-off-by: Tao Zhou
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 39 +++---
d
From: Hawking Zhang
Introduce utility functions designed to assist
in populating CPER records.
v2: call cper_init/fini in device_ip_init/fini.
Signed-off-by: Hawking Zhang
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/Makefile| 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu.h
From: Hawking Zhang
ACA error types managed by the driver do not have a direct 1:1
correspondence with those managed by firmware.
To address this, for each ACA bank, include
both the ACA error type and the ACA SMU type.
This addition is useful for creating CPER records.
Signed-off-by: Hawking Zhang
Reviewed-