On Thu, Mar 28, 2024 at 10:01 AM Dejia Shang <dejia.sh...@armchina.com> wrote:
>
> Dear Kernel Maintainers,
>
> I am a driver developer and would like to upstream the ArmChina Zhouyi NPU
> driver ("Zhouyi" is the brand) to accel subsystem.
>
> The driver is already open sourced (both UMD and KMD) and anyone can find the
> code from https://github.com/Arm-China/Compass_NPU_Driver.git.
>
> This driver is responsible for scheduling AI inference tasks to the NPU cores
> (V1/V2/V3). Specifically, a simplified end-to-end flow is:
>
> 1. A TFLite/ONNX model is transformed to an executable binary file in
>    ELF format by the NN graph compiler (designed by ArmChina)
> 2. An application loads the executable binary file to UMD and
>    provides the input data.
> 3. UMD parses the binary and sends ioctls to KMD (open device, do
>    memory allocation/mmap/free, submit the job descriptor).
> 4. KMD dispatches the job to NPU h/w, handles interrupts and updates
>    the execution status.
> 5. UMD polls the status of the pre-scheduled job.
> 6. The application gets the output results.
>
> So...for the upstreaming,
>
> Q1: do you think our NPU driver is suitable for accel? If the answer is yes,
> which tree & branch should the patches be based on?

Hi Dejia,

Yes, it definitely sounds like a good fit for the accel subsystem.
Please base your patches on the "drm-misc-next" branch in the drm-misc repo:
https://anongit.freedesktop.org/git/drm/drm-misc.git

>
> Q2: in thread
> https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9eda...@kernel.org/
> showing a similar case, Oded mentioned that:
>
> "If we would have upstreamed a new driver, the expectation would have
> been that we would use some drm mechanisms.", and
> "the minimal requirement is to use GEM/BOs for memory management
> operations".
>
> I guess those requirements are also applicable for the Zhouyi NPU KMD?
> Currently, the memory management (MM) in KMD is based on dma-mapping APIs,
> which handles both reserved CMA region(s) and SMMU mapped buffers, and
> supports the dma-buf framework. Maybe I should replace the implementations
> with DRM APIs.

Yes, those requirements definitely apply here.

>
> Q3: if you have looked at the KMD code, do you think I should make any other
> major change before submitting the first patch series? Thank you!

I took a quick glance. In general, it seems to be OK, but I noticed two things
related to the integration with drm/accel:

1. You use a scheduler for job submission, which provides the ability to
   defer jobs. In that case, I suggest checking whether you can use drm_sched
   instead of your own implementation. No point in re-inventing the wheel.

2. You provide several memory zones for memory allocation. I would suggest
   looking at using ttm as the memory manager instead of implementing your
   own.

And please remove the IMPORTANT NOTICE at the end of your emails. I would have
to refrain from answering further emails if that notice remains.

Thanks,
Oded

>
> Thanks for your time and look forward to your reply~ 😊
>
> Best Regards,
> Dejia
> IMPORTANT NOTICE: The contents of this email and any attachments may be
> privileged and confidential. If you are not the intended recipient, please
> delete the email immediately. It is strictly prohibited to disclose the
> contents to any other person, use it for any purpose, or store or copy the
> information in any medium. Thank you.
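> ©Arm Technology (China) Co., Ltd copyright and reserve all rights.
> IMPORTANT NOTICE: This email (including any attachments) may contain
> confidential information intended only for a specific individual or purpose,
> and is protected by law. If you are not the intended recipient, please delete
> this email immediately. It is strictly prohibited to disclose, store, or copy
> the information in this email, or to take any action based on it, through any
> channel, for any purpose, to any person. Thank you for your cooperation.
> ©Arm Technology (China) Co., Ltd. All rights reserved.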
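
For readers following the thread, steps 3 to 6 of the quoted flow (submit a job
descriptor, dispatch it to the hardware, complete it on interrupt, poll the
status, fetch the output) and the "ability to defer jobs" mentioned in the
reply can be modelled in a few lines of plain userspace C. This is only an
illustrative sketch under stated assumptions: it uses no kernel, DRM,
drm_sched, or Zhouyi driver API, it is not taken from the Compass_NPU_Driver
code, and every name in it (npu_dev, npu_submit, npu_irq, npu_poll,
npu_collect, NPU_MAX_JOBS) is hypothetical. The "inference" is a placeholder
that doubles an integer.

```c
/* Userspace model only. No kernel, DRM, or Zhouyi driver API is used;
 * every name below (npu_dev, npu_submit, ...) is hypothetical. */
#include <assert.h>
#include <stddef.h>

#define NPU_MAX_JOBS 4

enum npu_job_state { JOB_FREE, JOB_QUEUED, JOB_RUNNING, JOB_DONE };

struct npu_job {
    enum npu_job_state state;
    int input;   /* stand-in for the input buffer */
    int result;  /* stand-in for the output buffer */
};

struct npu_dev {
    struct npu_job jobs[NPU_MAX_JOBS];
    int running; /* slot currently on the modelled hardware, -1 if idle */
};

void npu_init(struct npu_dev *dev)
{
    assert(dev != NULL);
    for (int i = 0; i < NPU_MAX_JOBS; i++)
        dev->jobs[i].state = JOB_FREE;
    dev->running = -1;
}

/* Step 4 (dispatch): if the hardware is idle, start the lowest-numbered
 * queued slot. If it is busy, queued jobs simply stay deferred. */
static void npu_dispatch(struct npu_dev *dev)
{
    if (dev->running >= 0)
        return;
    for (int i = 0; i < NPU_MAX_JOBS; i++) {
        if (dev->jobs[i].state == JOB_QUEUED) {
            dev->jobs[i].state = JOB_RUNNING;
            dev->running = i;
            return;
        }
    }
}

/* Step 3: submit a job descriptor. Returns the slot id, or -1 if full. */
int npu_submit(struct npu_dev *dev, int input)
{
    for (int i = 0; i < NPU_MAX_JOBS; i++) {
        if (dev->jobs[i].state == JOB_FREE) {
            dev->jobs[i].state = JOB_QUEUED;
            dev->jobs[i].input = input;
            dev->jobs[i].result = 0;
            npu_dispatch(dev);
            return i;
        }
    }
    return -1;
}

/* Step 4 (interrupt): mark the running job done, then start the next one. */
void npu_irq(struct npu_dev *dev)
{
    if (dev->running < 0)
        return;
    struct npu_job *job = &dev->jobs[dev->running];
    job->result = job->input * 2; /* placeholder for real inference */
    job->state = JOB_DONE;
    dev->running = -1;
    npu_dispatch(dev);
}

/* Step 5: poll the status of a previously submitted job. */
enum npu_job_state npu_poll(const struct npu_dev *dev, int id)
{
    assert(id >= 0 && id < NPU_MAX_JOBS);
    return dev->jobs[id].state;
}

/* Step 6: fetch the output and release the slot. Returns 0 on success,
 * -1 if the job has not finished. */
int npu_collect(struct npu_dev *dev, int id, int *out)
{
    assert(id >= 0 && id < NPU_MAX_JOBS);
    if (dev->jobs[id].state != JOB_DONE)
        return -1;
    *out = dev->jobs[id].result;
    dev->jobs[id].state = JOB_FREE;
    return 0;
}
```

In this model, submitting two jobs back to back leaves the first one running
and the second one queued; calling npu_irq() once completes the first and
starts the second. A hand-rolled queue of this kind is the sort of logic the
reply suggests checking against drm_sched before re-implementing it.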