On Thu, Mar 28, 2024 at 10:01 AM Dejia Shang <dejia.sh...@armchina.com> wrote:
>
> Dear Kernel Maintainers,
>
> I am a driver developer and would like to upstream the ArmChina Zhouyi NPU 
> driver ("Zhouyi" is the brand) to the accel subsystem.
>
> The driver is already open source (both UMD and KMD), and anyone can find the 
> code at https://github.com/Arm-China/Compass_NPU_Driver.git.
>
> This driver is responsible for scheduling AI inference tasks to the NPU cores 
> (V1/V2/V3). Specifically, a simplified end-to-end flow is:
>
>         1. A TFLite/ONNX model is transformed to an executable binary file in 
> ELF format by the NN graph compiler (designed by ArmChina)
>         2. An application loads the executable binary file to UMD and 
> provides the input data.
>         3. UMD parses the binary and sends ioctls to KMD (open device, do 
> memory allocation/mmap/free, submit the job descriptor).
>         4. KMD dispatches the job to NPU h/w, handles interrupts and updates 
> the execution status.
>         5. UMD polls the status of the pre-scheduled job.
>         6. The application gets the output results.
>
> So, regarding upstreaming:
>
> Q1: do you think our NPU driver is suitable for accel? If the answer is yes, 
> which tree & branch should the patches be based on?
Hi Dejia,
Yes, it definitely sounds like a good fit for the accel subsystem.
Please base your patches on the "drm-misc-next" branch of the drm-misc repo:
https://anongit.freedesktop.org/git/drm/drm-misc.git

>
> Q2: in thread 
> https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9eda...@kernel.org/ 
> showing a similar case, Oded mentioned that:
>
>         "If we would have upstreamed a new driver, the expectation would have 
> been that we would use some drm mechanisms.", and
>         "the minimal requirement is to use GEM/BOs for memory management 
> operations".
>
> I guess those requirements are also applicable for the Zhouyi NPU KMD? 
> Currently, the memory management (MM) in KMD is based on dma-mapping APIs, 
> which handles both reserved CMA region(s) and SMMU mapped buffers, and 
> supports the dma-buf framework. Maybe I should replace the implementations 
> with DRM APIs.
Yes, those requirements definitely apply here.
>
> Q3: if you have looked at the KMD code, do you think I should make any other 
> major changes before submitting the first patch series? Thank you!
I took a quick glance. In general, it seems to be ok, but I noticed
two things related to the integration with drm/accel:

1. You use a scheduler for job submission, which provides the
ability to defer jobs. In that case, I suggest checking whether you
can use drm_sched instead of your own implementation. No point in
re-inventing the wheel.
2. You provide several memory zones for memory allocation. I would
suggest looking at using ttm as the memory manager instead of
implementing your own.

And please remove the IMPORTANT NOTICE at the end of your emails. I
would have to refrain from answering further emails if that notice
remains.

Thanks,
Oded

>
> Thanks for your time and look forward to your reply~ 😊
>
> Best Regards,
> Dejia
> IMPORTANT NOTICE: The contents of this email and any attachments may be 
> privileged and confidential. If you are not the intended recipient, please 
> delete the email immediately. It is strictly prohibited to disclose the 
> contents to any other person, use it for any purpose, or store or copy the 
> information in any medium. Thank you. ©Arm Technology (China) Co., Ltd 
> copyright and reserve all rights. 
