On Monday, 16 February 2026, 19:38:19 Central European Standard Time, Ross Cawston wrote:
> The Rocket NPU supports multiple task types:
> - Convolutional workloads that use CNA, Core, and DPU blocks
> - Standalone post-processing (PPU) tasks such as pooling and element-wise
>   operations
> - Pipelined DPU→PPU workloads
>
> The current driver has several limitations that prevent correct execution of
> non-convolutional workloads and multi-core operation:
>
> - CNA and Core S_POINTER registers are always initialized, re-arming them
>   with stale state from previous jobs and corrupting standalone DPU/PPU tasks.
> - Completion is hard-coded to wait only for DPU interrupts, causing PPU-only
>   or DPU→PPU pipeline jobs to time out.
> - Ping-pong mode is unconditionally enabled, which is unnecessary for
>   single-task jobs.
> - Non-zero cores hang because the vendor-specific "extra bit" (bit 28 × core
>   index) in S_POINTER is not set; the BSP sets this via MMIO because userspace
>   cannot know which core the scheduler will select.
> - Timeout and IRQ debugging information is minimal.
>
> This patch introduces two new per-task fields to struct rocket_task:
>
> - u32 int_mask: specifies which block completion interrupts signal task done
>   (DPU_0|DPU_1 for convolutional/standalone DPU, PPU_0|PPU_1 for PPU tasks).
>   Zero defaults to DPU_0|DPU_1 for backward compatibility.
> - u32 flags: currently used for ROCKET_TASK_NO_CNA_CORE to indicate standalone
>   DPU/PPU tasks that must not touch CNA/Core state.
>
> Additional changes:
> - Only initialize CNA and Core S_POINTER (with the required per-core extra
>   bit) when ROCKET_TASK_NO_CNA_CORE is not set.
> - Set the per-core extra bit via MMIO to fix hangs on non-zero cores.
> - Enable ping-pong mode only when the job contains multiple tasks.
> - Mask and clear interrupts according to the task's int_mask.
> - Accept both DPU and PPU completion interrupts in the IRQ handler.
> - Minor error-path fix in GEM object creation (check error after unlocking
>   mm_lock).
>
> These changes, derived from vendor BSP behavior, enable correct execution
> of PPU-only tasks, pipelined workloads, and reliable multi-core operation
> while preserving backward compatibility.

Missing a Signed-off-by line. Please see
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#developer-s-certificate-of-origin-1-1

Heiko

> ---
>  drivers/accel/rocket/rocket_gem.c |  2 +
>  drivers/accel/rocket/rocket_job.c | 99 +++++++++++++++++++++++++------
>  drivers/accel/rocket/rocket_job.h |  2 +
>  include/uapi/drm/rocket_accel.h   | 30 ++++++++++
>  4 files changed, 115 insertions(+), 18 deletions(-)
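
For readers following the quoted description, the per-task decisions it lists can be sketched in plain C roughly as below. This is not the patch code. The interrupt bit positions, the flag value, the struct task_sketch type and all helper names are hypothetical stand-ins. The "bit 28 × core index" wording is read here as (1 << 28) multiplied by the core index, and a task is treated as done as soon as any bit of its mask is raised; both readings are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt bit positions; the real ones come from the
 * driver's register headers and may differ. */
#define IRQ_DPU_0 (UINT32_C(1) << 8)
#define IRQ_DPU_1 (UINT32_C(1) << 9)
#define IRQ_PPU_0 (UINT32_C(1) << 10)
#define IRQ_PPU_1 (UINT32_C(1) << 11)

/* Hypothetical value for the flag named in the patch description. */
#define ROCKET_TASK_NO_CNA_CORE (UINT32_C(1) << 0)

/* Stand-in for the two new per-task fields; not the real struct rocket_task. */
struct task_sketch {
	uint32_t int_mask;
	uint32_t flags;
};

/* A zero mask falls back to DPU_0|DPU_1 for backward compatibility. */
static inline uint32_t effective_int_mask(const struct task_sketch *t)
{
	return t->int_mask ? t->int_mask : (IRQ_DPU_0 | IRQ_DPU_1);
}

/* Assumption: any raised bit inside the task's mask signals completion. */
static inline bool task_done(const struct task_sketch *t, uint32_t raw_status)
{
	return (raw_status & effective_int_mask(t)) != 0;
}

/* CNA/Core S_POINTER setup is skipped for standalone DPU/PPU tasks. */
static inline bool needs_cna_core_init(const struct task_sketch *t)
{
	return !(t->flags & ROCKET_TASK_NO_CNA_CORE);
}

/* Assumed reading of "bit 28 x core index": (1 << 28) * core_index. */
static inline uint32_t s_pointer_extra_bit(unsigned int core_index)
{
	assert(core_index < 16); /* larger values would overflow 32 bits */
	return (UINT32_C(1) << 28) * core_index;
}

/* Ping-pong mode is only useful when a job carries more than one task. */
static inline bool use_ping_pong(unsigned int task_count)
{
	return task_count > 1;
}
```

Under these assumptions, a PPU-only task would set int_mask to IRQ_PPU_0 | IRQ_PPU_1 together with ROCKET_TASK_NO_CNA_CORE, while a task from existing userspace leaves both fields zero and keeps the old DPU-only behavior.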
