On 17.05.22 at 21:20, Andrey Grodzovsky wrote:
Problem:
During a hive reset caused by a command timing out on a ring,
extra resets are triggered by KFD, which is
unable to access registers on the resetting ASIC.
Fix: Rework GPU reset to actively stop any pending reset
works while anoth
On 17.05.22 at 21:20, Andrey Grodzovsky wrote:
We need to be able to cancel pending reset works without blocking
from within GPU reset. Currently the kernel API allows this only
for delayed_work and not for work_struct. Switch to delayed
work and queue it with delay 0, which is equivalent to queueing a work
str
On 17.05.22 at 21:20, Andrey Grodzovsky wrote:
Will be read by executors of async reset like debugfs.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1
On Mon-16-05-2022 02:09 pm, Jani Nikula wrote:
On Mon, 02 May 2022, Harry Wentland wrote:
Both the kernel and IGT series look good to me.
I recommend you merge the entire kernel set as one into drm-next. We
can pull it into amd-staging-drm-next so as not to break our CI once
the IGT patches la
Disable the ABM feature when the system is running in AC mode to get
better display contrast.
v2: remove "UPSTREAM" from the subject.
v3: adv->pm.ac_power updating by amd gpu_acpi_event_handler.
v4: Add the file I lost to fix the build error.
v5: Move that function of the setting a
[WHY]
Unified memory with xnack off should be tracked, as userptr mappings
and legacy allocations are, to avoid oversubscribing system memory when
xnack is off.
[How]
Expose the functions reserve_mem_limit and unreserve_mem_limit to the SVM
API and call them on every prange creation and free.
Signed-off-by: Al
TTM used to track the "acc_size" of all BOs internally. We needed to
keep track of it in our memory reservation to avoid TTM running out
of memory in its own accounting. However, that "acc_size" accounting
has since been removed from TTM. Therefore we don't really need to
track it any more.
Signed
tree/branch:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 47c1c54d1bcd0a69a56b49473bc20f17b70e5242 Add linux-next specific
files for 20220517
Error/Warning reports:
https://lore.kernel.org/linux-mm/202204181931.klac6fwo-...@intel.com
https
Most of the changes are for the debugger feature; the aim is
to simplify trap handler support for new ASICs in the
future.
Signed-off-by: Eric Huang
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2527 +
.../amd/amdkfd/cwsr_trap_handler_gfx10.asm| 325 ++-
.../drm/amd/amdkfd/cws
I will try to take a look at this during this week btw
On Tue, 2022-05-10 at 17:56 +0800, Wayne Lin wrote:
> This patch set tries to resolve issues observed when unplugging monitors
> in an MST scenario. It reverts a few commits that cause side effects and seem
> no longer needed, and proposes a patch
Series is:
Reviewed-by: Alex Deucher
On Tue, May 17, 2022 at 12:29 PM Sunil Khatri wrote:
>
> Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set.
>
> Signed-off-by: Sunil Khatri
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gp
On Mon, May 16, 2022 at 3:20 PM Eric Huang wrote:
>
> It is to simplify trap handler support for new asics in
> the future.
It would be good to provide a basic overview of what the changes are.
Alex
>
> Signed-off-by: Eric Huang
> ---
> .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2527 ++
Applied. Thanks!
On Tue, May 17, 2022 at 9:13 AM Yuanjun Gong wrote:
>
> From: Gong Yuanjun
>
> gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but
> not freed in yellow_carp_fini_smc_tables().
>
> Signed-off-by: Gong Yuanjun
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_c
On 2022-05-17 at 15:21, Andrey Grodzovsky wrote:
We skip reset requests if another one is already in progress.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 ++
1 file changed, 27 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgp
Applied. Thanks!
On Tue, May 17, 2022 at 9:13 AM Yuanjun Gong wrote:
>
> From: Gong Yuanjun
>
> In radeon_fp_native_mode(), the return value of drm_mode_duplicate()
> is assigned to mode, which will lead to a NULL pointer dereference
> on failure of drm_mode_duplicate(). Add a check to avoid np
From: Evan Quan
Correct the metrics version used for SMU 11.0.11/12/13.
Fixes misreported GPU metrics (e.g., fan speed) depending
on which version of SMU firmware is loaded.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1925
Signed-off-by: Evan Quan
Signed-off-by: Alex Deucher
---
Applied with a reworked commit message.
Thanks,
Alex
On Tue, May 17, 2022 at 7:24 AM wrote:
>
> From: Haohui Mai
>
> Signed-off-by: Haohui Mai
> ---
> drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 27 +-
> 1 file changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/
We removed the wrapper that was queueing the recover function
into the reset domain queue and that was using this name.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2
We skip reset requests if another one is already in progress.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 ++
1 file changed, 27 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu
We need to have a delayed work to cancel this reset if another
is already in progress.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 15 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 31 ---
We need to have a delayed work to cancel this reset if another
is already in progress.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 +--
2 files changed, 19 insertions(+), 2 deletions(-)
diff
Will be read by executors of async reset like debugfs.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1 +
3 files changed, 6 insertions(+), 2 deletions(-)
Save the extra useless work scheduling. Also switch to delayed work.
Signed-off-by: Andrey Grodzovsky
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 12 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 +-
2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amd
We need to be able to cancel pending reset works without blocking
from within GPU reset. Currently the kernel API allows this only
for delayed_work and not for work_struct. Switch to delayed
work and queue it with delay 0, which is equivalent to queueing a work
struct.
Signed-off-by: Andrey Grodzovsky
---
drive
Problem:
During a hive reset caused by a command timing out on a ring,
extra resets are triggered by KFD, which is
unable to access registers on the resetting ASIC.
Fix: Rework GPU reset to actively stop any pending reset
works while another is in progress.
v2: Switch from generic list as
On Tue, May 17, 2022 at 2:06 AM wrote:
>
> From: Haohui Mai
>
> Remove the accidental shifts on the values of RPTR_BLOCK_SIZE in gfx_v8-v11.
> The bug essentially always programs the corresponding fields to zero
> instead of the correct value.
The hardware clamps values below 5 to 5. Updated th
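The effect described in the quoted patch (a field that always ends up as zero) can be reproduced with a small stand-alone sketch. It assumes a generic set-field helper that shifts the value into place and then masks it. The field position and width below are hypothetical and chosen only for illustration, and every ex_* name is made up for this example rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout for illustration only: a 6-bit field at bits 8..13. */
#define EX_FIELD_SHIFT 8u
#define EX_FIELD_MASK  (0x3Fu << EX_FIELD_SHIFT)

/* Generic set-field helper: shifts the raw value into place, then masks it. */
static uint32_t ex_set_field(uint32_t reg, uint32_t value)
{
    return (reg & ~EX_FIELD_MASK) | ((value << EX_FIELD_SHIFT) & EX_FIELD_MASK);
}

/* Extracts the field again so the programmed value can be inspected. */
static uint32_t ex_get_field(uint32_t reg)
{
    return (reg & EX_FIELD_MASK) >> EX_FIELD_SHIFT;
}

/* Buggy caller: pre-shifts the value, so the helper shifts it a second time
 * and the mask then discards every bit of it. */
static uint32_t ex_program_buggy(uint32_t reg, uint32_t value)
{
    return ex_set_field(reg, value << EX_FIELD_SHIFT);
}

/* Fixed caller: passes the raw value and lets the helper do the only shift. */
static uint32_t ex_program_fixed(uint32_t reg, uint32_t value)
{
    return ex_set_field(reg, value);
}
```

With this layout, any 6-bit value passed through ex_program_buggy() lands outside the mask and reads back as zero, while ex_program_fixed() reads back the value that was passed in.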
Please feel free to use:
Reviewed-by: Shashank Sharma
On 5/17/2022 12:36 PM, Christian König wrote:
Convert fdinfo format to one documented in drm-usage-stats.rst.
It turned out that the existing implementation was actually complete
nonsense. The calculated percentages indeed represented the
On Tue, May 17, 2022 at 1:00 PM Mario Limonciello
wrote:
>
> An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
> reset for handling aborted suspend can't work with s2idle.
>
> This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu:
> always reset the asic in suspe
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.
This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu:
always reset the asic in suspend (v2)"). A few other commits have
gone on top of the ASIC reset, b
[Public]
No, it is mode2 reset that it uses for the failure case.
From: Lazar, Lijo
Sent: Tuesday, May 17, 2022 11:51
To: Limonciello, Mario ;
amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle
[Public]
Ya, second is too lengthy. Better t
[Public]
Ya, second is too lengthy. Better to leave it as it is.
BTW, is this specific to reset by BACO? BACO entry/exit may take longer (better
chance of suspend entry abort by some wake-up source).
Thanks,
Lijo
From: Limonciello, Mario
Sent: Tuesday, May 17,
On Tue, May 17, 2022 at 12:30 PM Limonciello, Mario
wrote:
>
> [Public]
>
>
>
> > PM_SUSPEND_TO_IDLE should be under a compile guard
>
>
>
> It is actually. All of the amdgpu_acpi_* are. It’s not obvious though
> looking at the patch, you need to apply it to notice it.
>
>
>
> > It makes sense
[Public]
> PM_SUSPEND_TO_IDLE should be under a compile guard
It is actually. All of the amdgpu_acpi_* are. It's not obvious though looking
at the patch, you need to apply it to notice it.
> It makes sense to rename to something like amdgpu_need_reset_on_suspend() as
> it decides on reset on
Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set.
Signed-off-by: Sunil Khatri
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 7e55ee61f84c..798c56214a23 100
Use the IP version rather than the code names of IPs for
TMZ set.
Signed-off-by: Sunil Khatri
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 27 -
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
b/drivers/gpu/drm/amd/amdgpu/am
Enabling the TMZ feature based on IP version requires adev->ip_version
to be populated, but it is empty. Move amdgpu_gmc_tmz_set to a place where
ip_version is populated.
Signed-off-by: Sunil Khatri
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff
[Public]
A couple of things -
PM_SUSPEND_TO_IDLE should be under a compile guard
It makes sense to rename to something like amdgpu_need_reset_on_suspend() as it
decides on reset only for a suspend situation.
Thanks,
Lijo
Typically the acpi_video driver will initialize before radeon, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
radeon would register its own radeon_bl# device later, after which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid there b
Typically the acpi_video driver will initialize before amdgpu, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
amdgpu would register its own amdgpu_bl# device later, after which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid there b
Typically the acpi_video driver will initialize before nouveau, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
nouveau would register its own nv_backlight device later, after which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid the
On machines without an i915 opregion the acpi_video driver immediately
probes the ACPI video bus and used to also immediately register
acpi_video# backlight devices when supported.
Once the drm/kms driver then loaded later and possibly registered
a native backlight device then the drivers/acpi/vide
When acpi_video_register() has not run yet the video_bus_head will be
empty, so there is no need to check the register_count flag first.
Signed-off-by: Hans de Goede
---
drivers/acpi/acpi_video.c | 12
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/acpi/acpi_v
Remove the code to unregister acpi_video backlight devices when
a native backlight device gets registered later.
Now that the acpi_video backlight device registration is a separate step
which runs later, after the drm/kms driver is done setting up its own
native backlight device, it is no longer n
On x86/ACPI boards the acpi_video driver will usually initialize before
the kms driver (except i915). This causes /sys/class/backlight/acpi_video0
to show up and then the kms driver registers its own native backlight
device after which the drivers/acpi/video_detect.c code unregisters
the acpi_vid
Now that all kms drivers which register native/BACKLIGHT_RAW type backlight
devices on x86/ACPI boards call acpi_video_get_backlight_type(true), with
the native=true value getting cached, there no longer is a need to call
backlight_device_get_by_type(BACKLIGHT_RAW) to see if a native backlight
devi
Move the list_del removing an acpi_video_bus from video_bus_head
on teardown to before the teardown is done, to avoid code iterating
over the video_bus_head list seeing acpi_video_bus objects on there
which are (partly) torn down already.
Signed-off-by: Hans de Goede
---
drivers/acpi/acpi_video.
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.
Registering 2 backlight
Hi All,
As mentioned in my RFC titled "drm/kms: control display brightness through
drm_connector properties":
https://lore.kernel.org/dri-devel/0d188965-d809-81b5-74ce-7d30c49fe...@redhat.com/
The first step towards this is to deal with some existing technical debt
in backlight handling on x86/AC
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.
Registering 2 backlight
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.
Registering 2 backlight
Before this commit when we want userspace to use the acpi_video backlight
device we register both the GPU's native backlight device and acpi_video's
firmware acpi_video# backlight device. This relies on userspace preferring
firmware type backlight devices over native ones.
Registering 2 backlight
ATM on x86 laptops where we want userspace to use the acpi_video backlight
device we often register both the GPU's native backlight device and
acpi_video's firmware acpi_video# backlight device. This relies on
userspace preferring firmware type backlight devices over native ones, but
registering 2
On Tue, May 17, 2022 at 10:06 AM Limonciello, Mario
wrote:
>
> [Public]
>
>
>
> > -----Original Message-----
> > From: Alex Deucher
> > Sent: Tuesday, May 17, 2022 08:43
> > To: Limonciello, Mario
> > Cc: amd-gfx list
> > Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to
[Public]
> -----Original Message-----
> From: Alex Deucher
> Sent: Tuesday, May 17, 2022 08:43
> To: Limonciello, Mario
> Cc: amd-gfx list
> Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to
> s2idle
>
> On Tue, May 17, 2022 at 9:39 AM Mario Limonciello
> wrote:
> >
On Tue, May 17, 2022 at 9:39 AM Mario Limonciello
wrote:
>
> An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
> reset for handling aborted suspend can't work with s2idle.
>
> This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu:
> always reset the asic in suspe
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.
This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu:
always reset the asic in suspend (v2)"). A few other commits have
gone on top of the ASIC reset, b
From: Gong Yuanjun
In radeon_fp_native_mode(), the return value of drm_mode_duplicate()
is assigned to mode, which will lead to a NULL pointer dereference
on failure of drm_mode_duplicate(). Add a check to avoid npd.
The failure status of drm_cvt_mode() on the other path is checked too.
Signed-
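The pattern this patch description refers to (check the result of a duplicating allocator before dereferencing it) can be sketched in user space as follows. This is only an illustration under assumptions: the ex_* types and functions are hypothetical stand-ins rather than the DRM API, and ex_force_alloc_failure exists purely so the failure path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a display mode; not the DRM structure. */
struct ex_mode {
    int hdisplay;
    int vdisplay;
    int preferred;
};

/* Test hook (assumption): lets a caller simulate an allocation failure. */
static int ex_force_alloc_failure;

/* Duplicates a mode; returns NULL when the allocation fails. */
static struct ex_mode *ex_mode_duplicate(const struct ex_mode *src)
{
    struct ex_mode *copy;

    if (ex_force_alloc_failure)
        return NULL;

    copy = malloc(sizeof(*copy));
    if (copy)
        *copy = *src;
    return copy;
}

/* Returns a heap copy of the native mode marked as preferred, or NULL. */
static struct ex_mode *ex_native_mode(const struct ex_mode *native)
{
    struct ex_mode *mode = ex_mode_duplicate(native);

    /* The added check: without it, the write below dereferences NULL. */
    if (!mode)
        return NULL;

    mode->preferred = 1;
    return mode;
}
```

A caller then has to treat a NULL return as "no native mode available" and must free() the returned copy on success.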
On Tue, May 17, 2022 at 05:57:46PM +0800, Yuanjun Gong wrote:
> From: Gong Yuanjun
>
> gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but
> not freed in yellow_carp_fini_smc_tables().
>
> Signed-off-by: Gong Yuanjun
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
From: Gong Yuanjun
gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but
not freed in yellow_carp_fini_smc_tables().
Signed-off-by: Gong Yuanjun
---
drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/
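The leak described above comes down to an init/fini pair that is not symmetric. A minimal user-space sketch of the symmetric version is shown here, assuming a table context that owns a single heap buffer. The ex_* names are hypothetical, and calloc()/free() stand in for the kernel allocators the real driver would use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table context that owns one metrics buffer. */
struct ex_table_context {
    void *gpu_metrics_table;
    size_t gpu_metrics_table_size;
};

/* Allocates the metrics buffer; returns 0 on success, -1 on failure. */
static int ex_init_tables(struct ex_table_context *ctx, size_t size)
{
    ctx->gpu_metrics_table = calloc(1, size);
    if (!ctx->gpu_metrics_table)
        return -1;

    ctx->gpu_metrics_table_size = size;
    return 0;
}

/* Releases everything ex_init_tables() allocated. Leaving out the free()
 * here is the kind of leak the patch description reports. */
static void ex_fini_tables(struct ex_table_context *ctx)
{
    free(ctx->gpu_metrics_table);
    ctx->gpu_metrics_table = NULL;
    ctx->gpu_metrics_table_size = 0;
}
```

Resetting the pointer to NULL after freeing makes a repeated ex_fini_tables() call harmless, since free(NULL) is a no-op.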
On Tue, May 17, 2022 at 05:57:00PM +0800, Yuanjun Gong wrote:
> From: Gong Yuanjun
>
> In radeon_fp_native_mode(), the return value of drm_mode_duplicate()
> is assigned to mode, which will lead to a NULL pointer dereference
> on failure of drm_mode_duplicate(). Add a check to avoid npd.
>
> The
From: Haohui Mai
Signed-off-by: Haohui Mai
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 27 +-
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index dd8f4344eeb8..9a1b42cc850
This is enough to get gputop working :)
v2: rebase and some additional cleanup
Signed-off-by: Christian König
Reviewed-by: Shashank Sharma (v1)
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 15 +++
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/a
Convert fdinfo format to one documented in drm-usage-stats.rst.
It turned out that the existing implementation was actually complete
nonsense. The calculated percentages indeed represented the usage of the
engine, but with varying time slices.
So 10% usage for application A could mean something
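The difference between a precomputed percentage and a cumulative busy-time counter can be illustrated with a short sketch. It assumes the "drm-engine-<name>: <value> ns" key format from drm-usage-stats.rst. The exact whitespace, the ex_* helper names, and the sampling scheme are assumptions made for this example, not the amdgpu implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats one cumulative busy-time line for an engine. The tab after the
 * colon is an assumption about spacing, not something the source states. */
static int ex_format_engine_line(char *buf, size_t len,
                                 const char *engine, uint64_t busy_ns)
{
    return snprintf(buf, len, "drm-engine-%s:\t%llu ns\n",
                    engine, (unsigned long long)busy_ns);
}

/* Utilization over a sampling window chosen by the reader: the reader takes
 * two samples of the cumulative counter and divides the busy-time delta by
 * the wall-clock delta, so every client is measured over the same slice. */
static double ex_utilization_percent(uint64_t busy0_ns, uint64_t busy1_ns,
                                     uint64_t t0_ns, uint64_t t1_ns)
{
    if (t1_ns <= t0_ns || busy1_ns < busy0_ns)
        return 0.0;

    return 100.0 * (double)(busy1_ns - busy0_ns) / (double)(t1_ns - t0_ns);
}
```

Because the reader picks both sample times, percentages computed this way are comparable across applications, which is not the case when each value is derived from a different time slice.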
63 matches