Re: [PATCH 09/12] drm/amdgpu: Optimise amdgpu_ring_write()

2025-01-04 Thread Tvrtko Ursulin
On 02/01/2025 13:55, Christian König wrote: On 27.12.24 at 12:19, Tvrtko Ursulin wrote: From: Tvrtko Ursulin There are more than 2000 calls to amdgpu_ring_write() in the driver and the majority is multiple sequential calls which the compiler cannot optimise much. Let's make this helper vari

Re: [PATCH 4/6] amdgpu: fix use after free bug related to amdgpu_driver_release_kms()

2025-01-04 Thread Gerry Liu
> On 4 January 2025 at 01:34, Chen, Xiaogang wrote: > > > On 1/3/2025 1:43 AM, Shuo Liu wrote: >> On Fri 3.Jan'25 at 15:02:38 +0800, Gerry Liu wrote: >>> >>> On 3 January 2025 at 13:58, Chen, Xiaogang wrote: On 1/1/2025 11:36 PM, Jiang Liu wrote: > If some GPU device failed to probe, `rm

Re: [PATCH 04/12] drm/amdgpu: Consolidate a bunch of similar sdma insert nop vfuncs

2025-01-04 Thread Tvrtko Ursulin
On 02/01/2025 13:49, Christian König wrote: On 27.12.24 at 12:19, Tvrtko Ursulin wrote: From: Tvrtko Ursulin A lot of the hardware generations apparently use the same nop insertion logic, just with different masks and shifts. We can consolidate if we store those shifts and masks in the rin

Re: [PATCH 01/12] drm/amdgpu: Use memset32 for IB padding

2025-01-04 Thread Tvrtko Ursulin
On 02/01/2025 13:45, Christian König wrote: On 27.12.24 at 12:19, Tvrtko Ursulin wrote: From: Tvrtko Ursulin Use memset32 instead of open coding it, just because it is that bit nicer. In general looks mostly good, my only concern is that we already had to switch to memset_io() on some

Re: [PATCH 2/6] amdgpu: fix invalid memory access in kfd_cleanup_nodes()

2025-01-04 Thread Gerry Liu
> On 4 January 2025 at 01:33, Chen, Xiaogang wrote: > > > > On 1/3/2025 1:05 AM, Gerry Liu wrote: >> >> >>> On 3 January 2025 at 14:19, Chen, Xiaogang >> > wrote: >>> >>> >>> >>> On 1/2/2025 11:55 PM, Gerry Liu wrote: > On 3 January 2025 at 13:44, Chen, Xiaogang

Re: [PATCH v6 0/9] Add jump table support for objtool on LoongArch

2025-01-04 Thread Huacai Chen
Hi, Josh and Peter, I think this series (except the last patch, but that one can be a separate one) is good enough now, right? If so, I think there are some ways to get it upstream: 1) I merge objtool/core from tip.git to the loongarch tree, then apply this whole series with your acked-by; 2) You

[PATCH v2] drm/amdgpu/gfx10: Enable cleaner shader for GFX10.3.2/10.3.4/10.3.5 GPUs

2025-01-04 Thread Srinivasan Shanmugam
Enable the cleaner shader for GFX10.3.2/10.3.4/10.3.5 GPUs to provide data isolation between GPU workloads. The cleaner shader is responsible for clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps prevent data lea

[PATCH] drm/amdgpu/gfx10: Enable cleaner shader for GFX10.3.2/10.3.4/10.3.5 GPUs

2025-01-04 Thread Srinivasan Shanmugam
Enable the cleaner shader for GFX10.3.2/10.3.4/10.3.5 GPUs to provide data isolation between GPU workloads. The cleaner shader is responsible for clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps prevent data lea