Am 01.08.24 um 08:53 schrieb Marek Olšák:
On Thu, Aug 1, 2024, 00:28 Khatri, Sunil <sukha...@amd.com> wrote:
On 8/1/2024 8:49 AM, Marek Olšák wrote:
>> + /* Header is at index 0, followed by num_nops - 1 NOP
packet's */
>> + for (i = 1; i < num_nop; i++)
>> + amdgpu_ring_write(ring, ring->funcs->nop);
> This loop should be removed. It's unnecessary CPU overhead and we
> should never get more than 0x3fff NOPs (maybe use BUG_ON).
Leaving the
> whole packet body uninitialized is the fastest option.
That was the original intent to just move the WPTR for the no of nops
and tried too. Based on Christian inputs we should not let the
nops packet
as garbage or whatever was there originally as a threat/safety
measure.
It doesn't help safety. It can only be read by the GPU with
kernel-level permissions.
Initializing the packet body is useless and adds CPU overhead,
especially with the 256 NOPs or so that we use for no reason.
Not filling the remaining ring buffers with NOPs is a pretty clear NAK
from my side. Leaving garbage in the ring buffer is not even remotely
defensive.
What we can do is to optimize filling N DWs into the ring buffer without
updating the WPTR each time.
Regards,
Christian.
Marek