On Monday, 16 February 2026, 19:38:19 Central European Standard Time,
Ross Cawston wrote:
> The Rocket NPU supports multiple task types:
> - Convolutional workloads that use CNA, Core, and DPU blocks
> - Standalone post-processing (PPU) tasks such as pooling and element-wise
>   operations
> - Pipelined DPU→PPU workloads
> 
> The current driver has several limitations that prevent correct execution of
> non-convolutional workloads and multi-core operation:
> 
> - CNA and Core S_POINTER registers are always initialized, re-arming them
>   with stale state from previous jobs and corrupting standalone DPU/PPU tasks.
> - Completion is hard-coded to wait only for DPU interrupts, causing PPU-only
>   or DPU→PPU pipeline jobs to time out.
> - Ping-pong mode is unconditionally enabled, which is unnecessary for
>   single-task jobs.
> - Non-zero cores hang because the vendor-specific "extra bit" (bit 28 × core
>   index) in S_POINTER is not set; the BSP sets this via MMIO because userspace
>   cannot know which core the scheduler will select.
> - Timeout and IRQ debugging information is minimal.
> 
> This patch introduces two new per-task fields to struct rocket_task:
> 
> - u32 int_mask: specifies which block completion interrupts signal task done
>   (DPU_0|DPU_1 for convolutional/standalone DPU, PPU_0|PPU_1 for PPU tasks).
>   Zero defaults to DPU_0|DPU_1 for backward compatibility.
> - u32 flags: currently used for ROCKET_TASK_NO_CNA_CORE to indicate standalone
>   DPU/PPU tasks that must not touch CNA/Core state.
> 
> Additional changes:
> - Only initialize CNA and Core S_POINTER (with the required per-core extra
>   bit) when ROCKET_TASK_NO_CNA_CORE is not set.
> - Set the per-core extra bit via MMIO to fix hangs on non-zero cores.
> - Enable ping-pong mode only when the job contains multiple tasks.
> - Mask and clear interrupts according to the task's int_mask.
> - Accept both DPU and PPU completion interrupts in the IRQ handler.
> - Minor error-path fix in GEM object creation (check error after unlocking
>   mm_lock).
> 
> These changes, derived from vendor BSP behavior, enable correct execution
> of PPU-only tasks, pipelined workloads, and reliable multi-core operation
> while preserving backward compatibility.
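
For readers following the description above, the per-task decisions could be
sketched roughly as below. This is only an illustration of the commit message,
not the code from the patch: the struct name, the helper names, the interrupt
bit positions and the flag value are all hypothetical, and "bit 28 × core
index" is assumed to mean the core index shifted up to bit 28.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; the real values
 * live in the driver's register headers. */
#define INT_DPU_0 (1u << 0)
#define INT_DPU_1 (1u << 1)
#define INT_PPU_0 (1u << 2)
#define INT_PPU_1 (1u << 3)

/* Hypothetical stand-in for ROCKET_TASK_NO_CNA_CORE; the real value is
 * defined by the uAPI header. */
#define TASK_NO_CNA_CORE (1u << 0)

/* Hypothetical reduced view of the two new per-task fields. */
struct task_sketch {
	uint32_t int_mask;
	uint32_t flags;
};

/* A zero int_mask means "not specified": fall back to DPU completion,
 * which keeps the old behaviour for existing userspace. */
static inline uint32_t effective_int_mask(const struct task_sketch *t)
{
	return t->int_mask ? t->int_mask : (INT_DPU_0 | INT_DPU_1);
}

/* Assumed reading of "bit 28 x core index": the core index is placed
 * at bit 28, so core 0 adds nothing and core 1 sets bit 28. */
static inline uint32_t s_pointer_extra(unsigned int core_index)
{
	assert(core_index < 16); /* only bits 28..31 are available */
	return (uint32_t)core_index << 28;
}

/* CNA/Core S_POINTER is only touched when the task did not opt out. */
static inline bool needs_cna_core_init(const struct task_sketch *t)
{
	return !(t->flags & TASK_NO_CNA_CORE);
}

/* Ping-pong mode is only useful when more than one task is queued. */
static inline bool use_ping_pong(unsigned int task_count)
{
	return task_count > 1;
}
```

Under these assumptions, a PPU-only task would set int_mask to the two PPU
bits and flags to the no-CNA/Core flag, while a task from older userspace
leaves both fields zero and still waits on the DPU interrupts.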

Missing a Signed-off-by line.

Please see
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#developer-s-certificate-of-origin-1-1


Heiko

> ---
>  drivers/accel/rocket/rocket_gem.c |  2 +
>  drivers/accel/rocket/rocket_job.c | 99 +++++++++++++++++++++++++------
>  drivers/accel/rocket/rocket_job.h |  2 +
>  include/uapi/drm/rocket_accel.h   | 30 ++++++++++
>  4 files changed, 115 insertions(+), 18 deletions(-)


