On Fri, Jul 19, 2024 at 9:39 AM Alex Deucher wrote:
>
> On Thu, Jul 18, 2024 at 1:00 PM Friedrich Vock wrote:
> >
> > Hi,
> >
> > On 18.07.24 16:06, Alex Deucher wrote:
> > > This adds preliminary support for GC per queue reset. In this
> > > case, only the jobs currently in the queue are lost.
From: Xiaogang Chen
When an app unmaps VM ranges (munmap), kfd/svm starts draining pending page faults
and does not handle any incoming page faults of this process until a deferred work
item has been executed by the default system wq. The period of "not handling page
faults" can be long and is unpredictable. That is ad
[Public]
> -----Original Message-----
> From: Kuehling, Felix
> Sent: Friday, July 19, 2024 2:34 PM
> To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdkfd: allow users to target recommended SDMA
> engines
>
> On 2024-07-18 19:05, Jonathan Kim wrote:
> > Certain GPUs
On 2024-07-18 19:05, Jonathan Kim wrote:
Certain GPUs have better copy performance over xGMI on specific
SDMA engines depending on the source and destination GPU.
Allow users to create SDMA queues on these recommended engines.
Close to 2x overall performance has been observed with this
optimizati
On Fri, Jul 19, 2024 at 05:18:05PM +0200, Christian König wrote:
> Am 19.07.24 um 15:02 schrieb Christian König:
> > Am 19.07.24 um 11:47 schrieb Tvrtko Ursulin:
> > > From: Tvrtko Ursulin
> > >
> > > Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
> > > creation") a change wa
'stream_enc_regs' array is an array of dcn10_stream_enc_registers
structures. The array is initialized with four elements, corresponding
to the four calls to stream_enc_regs() in the array initializer. This
means that valid indices for this array are 0, 1, 2, and 3.
The error message 'stream_enc_r
While Alex already fixed a bunch of them we still have tons of call
paths which are accessing the hw without holding the reset lock to
prevent concurrent GPU resets.
Start pointing those out so that we can eventually fix them. Only
point out the first misbehavior per driver load so that we won't
o
Am 19.07.24 um 11:16 schrieb Jack Xiao:
Wait until enough memory room is available before writing MES packets
to avoid ring buffer overflow.
Signed-off-by: Jack Xiao
---
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
2 file
Am 19.07.24 um 15:02 schrieb Christian König:
Am 19.07.24 um 11:47 schrieb Tvrtko Ursulin:
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
creation") a change was made which prevented priority changes for
entities
with only one assigned scheduler.
Th
On Fri, Jul 19, 2024 at 5:35 AM Jack Xiao wrote:
>
> Wait until enough memory room is available before writing MES packets
> to avoid ring buffer overflow.
>
> Signed-off-by: Jack Xiao
Fixes: de3246254156 ("drm/amdgpu: cleanup MES11 command submission")
Fixes: fffe347e1478 ("drm/amdgpu: cleanup MES12 command
On Thu, Jul 18, 2024 at 1:00 PM Friedrich Vock wrote:
>
> Hi,
>
> On 18.07.24 16:06, Alex Deucher wrote:
> > This adds preliminary support for GC per queue reset. In this
> > case, only the jobs currently in the queue are lost. If this
> > fails, we fall back to a full adapter reset.
>
> First o
In amdgpu_connector_add_common_modes(), the return value of drm_cvt_mode()
is assigned to mode, which will lead to a NULL pointer dereference on
failure of drm_cvt_mode(). Add a check to avoid npd.
Cc: sta...@vger.kernel.org
Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: M
Return 0 to avoid returning an uninitialized variable r.
Cc: sta...@vger.kernel.org
Fixes: 230dd6bb6117 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke
---
Changes in v2:
- added Cc stable line.
---
drivers/gpu/drm/amd/amdgpu/smu_v13_0_10.c | 2 +-
1 file changed,
In radeon_add_common_modes(), the return value of drm_cvt_mode() is
assigned to mode, which will lead to a possible NULL pointer dereference
on failure of drm_cvt_mode(). Add a check to avoid npd.
Cc: sta...@vger.kernel.org
Fixes: d50ba256b5f1 ("drm/kms: start adding command line interface using f
Am 19.07.24 um 11:47 schrieb Tvrtko Ursulin:
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
creation") a change was made which prevented priority changes for entities
with only one assigned scheduler.
The commit reduced drm_sched_entity_set_priority()
Am 19.07.24 um 11:36 schrieb Yin, ZhenGuo (Chris):
[AMD Official Use Only - AMD Internal Distribution Only]
Hi, Christian
Why would losing VRAM cause the VM entity to become invalid?
I think only if a fence error has appeared (like a pending SDMA job that
timed out or was cancelled), then
On 17/07/2024 16:02, Paneer Selvam, Arunpravin wrote:
On 7/16/2024 3:34 PM, Matthew Auld wrote:
On 16/07/2024 10:50, Paneer Selvam, Arunpravin wrote:
Hi Matthew,
On 7/10/2024 6:20 PM, Matthew Auld wrote:
On 10/07/2024 07:03, Paneer Selvam, Arunpravin wrote:
Thanks Alex.
Hi Matthew,
Any co
From: Arnd Bergmann
Multiple files in amdgpu call amdgpu_ucode_request() with a fw_name
variable that the compiler cannot check for being a valid format string,
as seen by enabling the (default-disabled) -Wformat-security option:
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function
'amdgpu_mes_
From: Tvrtko Ursulin
Long time ago in commit b3ac17667f11 ("drm/scheduler: rework entity
creation") a change was made which prevented priority changes for entities
with only one assigned scheduler.
The commit reduced drm_sched_entity_set_priority() to simply update the
entities priority, but the
[AMD Official Use Only - AMD Internal Distribution Only]
Hi, Christian
Why would losing VRAM cause the VM entity to become invalid?
I think only if a fence error has appeared (like a pending SDMA job that
timed out or was cancelled), then the entity vm->delayed will be set to error.
If a g
Am 19.07.24 um 11:19 schrieb ZhenGuo Yin:
[Why]
Page tables of a compute VM in VRAM will be lost after a GPU reset.
VRAM won't be restored since the compute VM has no shadows.
[How]
Use the upper 32 bits of vm->generation to record a vram_lost_counter.
Reset the VM state machine when the counter is not equa
[Why]
Page tables of a compute VM in VRAM will be lost after a GPU reset.
VRAM won't be restored since the compute VM has no shadows.
[How]
Use the upper 32 bits of vm->generation to record a vram_lost_counter.
Reset the VM state machine when the counter is not equal to the current
vram_lost_counter of the device
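
The [How] part above can be illustrated with a small userspace sketch. This
is not the amdgpu code: the helper names (make_generation,
generation_vram_lost, vm_needs_reset) and the meaning of the lower 32 bits
are assumptions made only for illustration. It shows one way to keep a 32-bit
VRAM-lost counter in the upper half of a 64-bit generation value and to
compare it against the device's current counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack a VRAM-lost counter into the upper 32 bits
 * of a 64-bit generation value; the lower 32 bits carry whatever the
 * generation tracked before (an assumption for this sketch). */
static inline uint64_t make_generation(uint32_t vram_lost_counter, uint32_t low)
{
    return ((uint64_t)vram_lost_counter << 32) | low;
}

/* Hypothetical helper: extract the recorded VRAM-lost counter. */
static inline uint32_t generation_vram_lost(uint64_t generation)
{
    return (uint32_t)(generation >> 32);
}

/* The VM state needs a reset when the recorded counter no longer
 * matches the device's current counter. */
static inline bool vm_needs_reset(uint64_t vm_generation,
                                  uint32_t dev_vram_lost_counter)
{
    return generation_vram_lost(vm_generation) != dev_vram_lost_counter;
}
```

With this layout a single 64-bit comparison of generations also notices a
VRAM loss, because a change in the upper half changes the whole value.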
Wait until enough memory room is available before writing MES packets
to avoid ring buffer overflow.
Signed-off-by: Jack Xiao
---
drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 18 ++
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 18 ++
2 files changed, 28 insertions(+), 8 deletions(-)
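
The idea in the commit message above, waiting for room before writing, can be
sketched in plain userspace C. This is not the mes_v11_0.c or mes_v12_0.c
change: struct ring, ring_free_dw, ring_wait_room and fake_consumer_step are
hypothetical names, and the ring is assumed to have a power-of-two size in
dwords with one slot kept empty.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring: size is a power of two, pointers are dword indices. */
struct ring {
    uint32_t size;
    uint32_t wptr; /* next slot the producer writes */
    uint32_t rptr; /* next slot the consumer reads */
};

/* Free dwords; one slot stays empty so full and empty are distinguishable. */
static uint32_t ring_free_dw(const struct ring *r)
{
    return (r->rptr - r->wptr - 1u) & (r->size - 1u);
}

/* Poll until at least ndw dwords are free, calling poll() between checks.
 * Gives up after max_polls polls and returns false instead of overflowing. */
static bool ring_wait_room(struct ring *r, uint32_t ndw, unsigned int max_polls,
                           void (*poll)(struct ring *))
{
    for (unsigned int i = 0; ; i++) {
        if (ring_free_dw(r) >= ndw)
            return true;
        if (i == max_polls)
            return false;
        if (poll)
            poll(r);
    }
}

/* Stand-in for the hardware consuming one dword per poll. */
static void fake_consumer_step(struct ring *r)
{
    if (r->rptr != r->wptr)
        r->rptr = (r->rptr + 1u) & (r->size - 1u);
}
```

A caller would only write its packet after ring_wait_room() returns true, and
treat false as a timeout rather than writing over unread entries.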
Am 18.07.24 um 23:12 schrieb Felix Kuehling:
On 2024-07-18 17:05, Philip Yang wrote:
This patch series does additional queue buffer validation in the queue
creation IOCTLs and fails the queue creation if the buffers are not mapped
on the GPU with the expected size.
Ensure queue buffers residency by tracking
[AMD Official Use Only - AMD Internal Distribution Only]
Same issue on CPU page table update.
-----Original Message-----
From: Kuehling, Felix
Sent: Thursday, July 18, 2024 12:28 AM
To: Christian König ; YuanShang Mao (River)
; Huang, Trigger ;
amd-gfx@lists.freedesktop.org; cao, lin
Subject: