For a partial migration from RAM to VRAM, migrate->cpages is not
equal to migrate->npages, so migrate->npages should be used to check all
pages that need migrating, whether or not they could be copied.
Also, only the pages that could be migrated should be set in migrate->dst[i];
otherwise migrate_vma_pages will migrate the wro
> On January 10, 2025, at 14:51, Christian König wrote:
>
> On 10.01.25 at 03:08, Jiang Liu wrote:
>> Function detects initialization status by checking sched->ops, so set
>> sched->ops to non-NULL just before return in function
>> amdgpu_fence_driver_sw_fini() and amdgpu_device_init_schedulers()
>> to avoid
On 10.01.25 at 03:08, Jiang Liu wrote:
Function detects initialization status by checking sched->ops, so set
sched->ops to non-NULL just before return in function
amdgpu_fence_driver_sw_fini() and amdgpu_device_init_schedulers()
to avoid possible invalid memory access on error recover path.
Sig
On 10.01.25 at 04:38, Srinivasan Shanmugam wrote:
This commit adds mutex locking to the `amdgpu_vmid_mgr_init` function.
By acquiring and releasing the `enforce_isolation_mutex`, the function now
safely allocates reserved VMIDs, which is important for enforcing
isolation between different GPU proc
From: "jesse.zh...@amd.com"
Using mmio to do queue reset
v2: Align the function with gfx9/gfx9.4.3.
Signed-off-by: Jesse Zhang adev;
unsigned i;
+ uint32_t tmp;
/* enter save mode */
amdgpu_gfx_rlc_enter_safe_mode(adev, xcc_id);
@@ -3813,7 +3814,25 @@ static
From: "jesse.zh...@amd.com"
Using mmio to do queue reset.
v2: Align this function with gfx9/gfx9.4.3.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 34 ++
1 file changed, 34 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.
This commit adds mutex locking to the `amdgpu_vmid_mgr_init` function.
By acquiring and releasing the `enforce_isolation_mutex`, the function now
safely allocates reserved VMIDs, which is important for enforcing
isolation between different GPU processes.
Mutex ensures that the process of allocating
On 1/9/2025 8:27 PM, Kim, Jonathan wrote:
> [Public]
>
>> -Original Message-
>> From: Lazar, Lijo
>> Sent: Thursday, January 9, 2025 1:14 AM
>> To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org
>> Cc: Kasiviswanathan, Harish
>> Subject: Re: [PATCH] drm/amdgpu: fix gpu recovery disable
On 1/9/2025 10:36 PM, Alex Deucher wrote:
> On Thu, Jan 9, 2025 at 12:59 AM Lazar, Lijo wrote:
>>
>>
>>
>> On 1/9/2025 4:26 AM, Alex Deucher wrote:
>>> Add helpers to switch the workload profile dynamically when
>>> commands are submitted. This allows us to switch to
>>> the FULLSCREEN3D or CO
> On January 9, 2025, at 01:19, Mario Limonciello wrote:
>
> On 1/8/2025 07:59, Jiang Liu wrote:
>> Add a flag to track ras debugfs creation status, to avoid possible
>> incorrect reference count management for ras block object in function
>> amdgpu_ras_aca_is_supported().
>
> Rather than taking a marker posi
Disable gfxoff with the compute workload on gfx12. This is a
workaround for the opencl test failure.
Signed-off-by: Kenneth Feng
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
b/d
Function detects initialization status by checking sched->ops, so set
sched->ops to non-NULL just before return in function
amdgpu_fence_driver_sw_fini() and amdgpu_device_init_schedulers()
to avoid possible invalid memory access on error recover path.
Signed-off-by: Jiang Liu
---
drivers/gpu/dr
Clear the adev->in_suspend flag when suspend fails; otherwise it will
cause too many warnings like:
[ 1802.212027] [ cut here ]
[ 1802.212028] WARNING: CPU: 97 PID: 11282 at
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:452 amdgpu_bo_free_kernel+0xf9/0x120
[amdgpu]
[ 1802.2121
If a GPU device fails to probe, `rmmod amdgpu` will trigger a
use-after-free bug related to amdgpu_driver_release_kms() as:
[16002.085540] BUG: kernel NULL pointer dereference, address:
[16002.093792] #PF: supervisor read access in kernel mode
[16002.03] #PF: error_code(0x0
Introduce a new interface, amdgpu_xcp_drm_dev_free(), to free a specific
drm_device created by amdgpu_xcp_drm_dev_alloc(), which will be used
for error recovery.
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/amdxcp/amdgpu_xcp_drv.c | 65 +
drivers/gpu/drm/amd/amdxcp/amdgpu_
Enhance error handling in function amdgpu_pci_probe() to avoid
possible resource leakage.
Signed-off-by: Jiang Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/d
This patchset tries to fix several memory leaks/invalid memory
accesses on the error handling path during GPU driver loading/unloading.
They apply to:
https://gitlab.freedesktop.org/agd5f/linux.git amd-staging-drm-next
v4:
1) drop patch 1 in v3
2) split out amdxcp related change into a dedicated
Wang, Yang(Kevin) would like to recall the message, "[PATCH] drm/amdgpu:
disable gfxoff with the compute workload on gfx12".
> On January 8, 2025, at 17:05, Christian König wrote:
>
> On 08.01.25 at 09:56, Jiang Liu wrote:
>> If an error happens before amdgpu_fence_driver_hw_init() gets called during
>> device probe, it will trigger a false warning in amdgpu_irq_put() as
>> below:
>> [ 1209.300996] [ cut here ]
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Yang Wang
Best Regards,
Kevin
-Original Message-
From: Kenneth Feng
Sent: Thursday, January 9, 2025 16:25
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Yang(Kevin) ; Feng, Kenneth
Subject: [PATCH] drm/amdgpu: disable
On Thu, Jan 9, 2025 at 3:29 PM Borislav Petkov wrote:
>
> Hi folks,
>
> this is rc6 + tip/master, machine is Carrizo laptop.
Possibly fixed by this patch?
https://lore.kernel.org/lkml/CAJZ5v0i=ap+w4QZ8f2DsaHY6D=XUEuSNjyQ-2_=dgolfzjd...@mail.gmail.com/T/
Alex
>
> full dmesg attached.
>
> Thx.
>
On Wed, Jan 8, 2025 at 10:18 AM Kent Russell wrote:
>
> Mark options only meant to be used for debugging as unsafe so that the
> kernel is tainted when they are used.
>
> Signed-off-by: Kent Russell
Reviewed-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +-
> 1
On 2025-01-08 10:18, Kent Russell wrote:
Mark options only meant to be used for debugging as unsafe so that the
kernel is tainted when they are used.
Signed-off-by: Kent Russell
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 +-
1 file changed, 5 i
On 2025-01-08 17:55, Xiaogang.Chen wrote:
From: Xiaogang Chen
The current kfd driver has its own PASID value for a kfd process and uses it to
locate the vm in the interrupt handler or to map between a kfd process and a vm. That
design does not work when a physical gpu device has multiple spatial
partitions,
If user shader issues S_SETVSKIP then this state will persist when
executing the trap handler, causing vector instructions to be
skipped.
Restore VSKIP state before resuming the user shader.
Signed-off-by: Jay Cornwall
Cc: Lancelot Six
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2721 +
Source and binary have become mismatched during branch activity.
Signed-off-by: Jay Cornwall
Cc: Lancelot Six
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 726 +-
1 file changed, 359 insertions(+), 367 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handle
On Thu, 9 Jan 2025 at 17:58, Felix Kuehling wrote:
>
> From: Christian König
>
> Try pinning into VRAM to allow P2P with RDMA NICs without ODP
> support if all attachments can do P2P. If any attachment can't do
> P2P just pin into GTT instead.
>
> Signed-off-by: Christian König
> Signed-off-by:
On 2025-01-08 20:11, Philip Yang wrote:
On 2025-01-07 22:08, Deng, Emily wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Philip,
It still has the deadlock; maybe the best way is to try removing the
delayed free pt work.
[Wed Jan 8 10:35:44 2025 < 0.00>] INF
General note - don't use HTML for mailing list communication.
I'm not sure if Apple Mail lets you switch this around.
If not, you might try using Thunderbird instead. You can pick to reply
in plain text or HTML by holding shift when you hit "reply all"
For my reply I'll convert my reply to p
On Wed, Jan 8, 2025 at 11:17 PM Feng, Kenneth wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> -Original Message-
> From: Deucher, Alexander
> Sent: Thursday, January 9, 2025 6:56 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Pillai, Aurabindo ; Feng, Kenneth
> ; De
On Thu, Jan 9, 2025 at 12:59 AM Lazar, Lijo wrote:
>
>
>
> On 1/9/2025 4:26 AM, Alex Deucher wrote:
> > Add helpers to switch the workload profile dynamically when
> > commands are submitted. This allows us to switch to
> > the FULLSCREEN3D or COMPUTE profile when work is submitted.
> > Add a del
From: Christian König
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Signed-off-by: Christian König
Signed-off-by: Felix Kuehling
Reviewed-by: Felix Kuehling
Tested-by: Pak Nin Lui
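
The placement rule in this commit message can be modelled in a few lines. This is only a rough userspace sketch, not the patch itself: `enum pin_domain`, `struct attachment`, its `peer2peer` field and `choose_pin_domain()` are hypothetical stand-ins invented for illustration and are not the amdgpu or dma-buf definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum pin_domain { PIN_DOMAIN_GTT, PIN_DOMAIN_VRAM };

struct attachment {
    bool peer2peer; /* assumed flag: this importer can do P2P */
};

/*
 * Pick VRAM only when every attachment can do P2P; as soon as one
 * attachment cannot, fall back to GTT for the whole buffer.
 */
static enum pin_domain choose_pin_domain(const struct attachment *att, size_t n)
{
    assert(att != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!att[i].peer2peer)
            return PIN_DOMAIN_GTT;
    }
    return PIN_DOMAIN_VRAM;
}
```

In this sketch a buffer with no attachments at all also ends up in VRAM; whether the actual patch treats that case the same way is not stated in the message above.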
Hi Dave, Simona,
Fixes for 6.13.
The following changes since commit 273b3eb600713a5e71c64b8b403b355dc580f167:
Merge tag 'drm-xe-fixes-2025-01-02' of
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes (2025-01-03
10:57:31 +1000)
are available in the Git repository at:
https://git
On Thu, Jan 9, 2025 at 11:17 AM Srinivasan Shanmugam
wrote:
>
> This commit addresses a circular locking dependency issue within the GFX
> isolation mechanism. The problem was identified by a warning indicating
> a potential deadlock due to inconsistent lock acquisition order.
>
> - The `amdgpu_gf
This commit addresses a circular locking dependency issue within the GFX
isolation mechanism. The problem was identified by a warning indicating
a potential deadlock due to inconsistent lock acquisition order.
- The `amdgpu_gfx_enforce_isolation_ring_begin_use` and
`amdgpu_gfx_enforce_isolation_
On 08.01.25 at 17:30, Chen, Xiaogang wrote:
On 1/8/2025 3:16 AM, Christian König wrote:
On 08.01.25 at 09:56, Jiang Liu wrote:
Function detects initialization status by checking sched->ops,
Where is that done? Inside the scheduler or inside amdgpu?
Inside amdgpu set ring->sched.ops to null
[Public]
> -Original Message-
> From: Lazar, Lijo
> Sent: Thursday, January 9, 2025 1:14 AM
> To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org
> Cc: Kasiviswanathan, Harish
> Subject: Re: [PATCH] drm/amdgpu: fix gpu recovery disable with per queue reset
>
>
>
> On 1/9/2025 1:31 AM, Jona
On 2025-01-09 12:00, Mischa Baars wrote:
> On Mon, Jan 6, 2025 at 4:41 PM Michel Dänzer
> wrote:
>
>> I'm sort of a fan of Michael Abrash, as he inspired me to learn
>> programming assembly language a long time ago, but in his Graphics
>> Programming Black Boo
On Thu, Jan 9, 2025 at 3:58 AM Kenneth Feng wrote:
>
> Disable gfxoff with the compute workload on gfx12. This is a
> workaround for the opencl test failure.
>
> Signed-off-by: Kenneth Feng
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 --
> 1 file changed, 4 insertions(+), 2 deleti
Because freeing of the pt is delayed, the bo that was meant to be freed has
been reused, which causes an unexpected page fault and then a call to
svm_range_restore_pages.
Details below:
1. It wants to free the pt in the following code, but the pt is not freed
immediately; schedule_work(&vm->pt_free_work) is used instead;
[ 92.276838] Call Tra
On 09/01/2025 11:34, Saleemkhan Jamadar wrote:
Introduce a db_info structure to populate the doorbell
information that is required to be mapped.
Make the doorbell mapping function more generic
by moving the parameters that vary based on IPs and/or use case
into the db_info structure.
v2 - Fi
On 09/01/2025 11:34, Saleemkhan Jamadar wrote:
VCN and VPE have different offset ranges; update the doorbell
offset range respectively.
Doorbell size for VCN and VPE is 32-bit.
v1 : add gfx switch case and fix checkpatch warnings (Shashank)
Signed-off-by: Saleemkhan Jamadar
---
drivers/gpu/
On Mon, Jan 6, 2025 at 4:41 PM Michel Dänzer
wrote:
> Yeah, that's not how double-buffering works in GL. The draw buffer is
always GL_BACK, SwapBuffers doesn't affect that (it just may internally
change which actual buffer GL_BACK refers to).
>
> I don't see more context about the issue you're in
VCN and VPE have different offset ranges; update the doorbell
offset range respectively.
Doorbell size for VCN and VPE is 32-bit.
v1 : add gfx switch case and fix checkpatch warnings (Shashank)
Signed-off-by: Saleemkhan Jamadar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 24 +
Introduce a db_info structure to populate the doorbell
information that is required to be mapped.
Make the doorbell mapping function more generic
by moving the parameters that vary based on IPs and/or use case
into the db_info structure.
v2 - Fix space alignment and checkpatch warnings(Shashank)
Hi,
Why:
The current implementation of doorbell mapping does not handle
the IP-specific doorbell size and offset range. When multiple doorbells
are requested, they cannot be allocated due to the hard-coded use of
"struct amdgpu_usermode_queue" parameters. But these parameters
vary for each request of doorbell
On Mon, Jan 6, 2025 at 4:30 AM Mario Limonciello
wrote:
> When new specifications are made available it's not like the old one
> suddenly becomes "open", so I don't see any reason that a new
> specification would change anything.
I paid about €3000 for my new PC, including €300 for the graphics
Disable gfxoff with the compute workload on gfx12. This is a
workaround for the opencl test failure.
Signed-off-by: Kenneth Feng
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
b/
Hi Mirsad.
Did you send only this patch, or did I miss patches 1 and 3 of the series? I can't
find them anywhere.
Carlos
On Tue, Dec 17, 2024 at 11:58:12PM +0100, Mirsad Todorovac wrote:
> The source static analysis tool gave the following advice:
>
> ./fs/xfs/libxfs/xfs_dir2.c:382:15-22: WARNING
49 matches