On 2021/6/16 15:09, Morten Brørup wrote:
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Bruce Richardson
>> Sent: Tuesday, 15 June 2021 18.39
>>
>> On Tue, Jun 15, 2021 at 09:22:07PM +0800, Chengwen Feng wrote:
>>> This patch introduces 'dmadevice' which is a generic type of DMA
>>> device.
>>>
>>> The APIs of dmadev library expose some generic operations which can
>>> enable configuration and I/O with the DMA devices.
>>>
>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com>
>>> ---
>> Thanks for sending this.
>>
>> Of most interest to me right now are the key data-plane APIs. While we
>> are still in the prototyping phase, below is a draft of what we are
>> thinking for the key enqueue/perform_ops/completed_ops APIs.
>>
>> Some key differences I note in below vs your original RFC:
>> * Use of void pointers rather than iova addresses. While using iova's
>>   makes sense in the general case when using hardware, in that it can
>>   work with both physical addresses and virtual addresses, if we change
>>   the APIs to use void pointers instead it will still work for DPDK in
>>   VA mode, while at the same time allow use of software fallbacks in
>>   error cases, and also a stub driver that uses memcpy in the
>>   background. Finally, using iova's makes the APIs a lot more awkward to
>>   use with anything but mbufs or similar buffers where we already have a
>>   pre-computed physical address.
>> * Use of id values rather than user-provided handles. Allowing the
>>   user/app to manage the amount of data stored per operation is a better
>>   solution, I feel, than prescribing a certain amount of in-driver
>>   tracking. Some apps may not care about anything other than a job being
>>   completed, while other apps may have significant metadata to be
>>   tracked. Taking the user-context handles out of the API also makes the
>>   driver code simpler.
>> * I've kept a single combined API for completions, which differs from
>>   the separate error handling completion API you propose. I need to give
>>   the two function approach a bit of thought, but likely both could work.
>>   If we (likely) never expect failed ops, then the specifics of error
>>   handling should not matter that much.
>>
>> For the rest, the control / setup APIs are likely to be rather
>> uncontroversial, I suspect. However, I think that rather than xstats
>> APIs, the library should first provide a set of standardized stats like
>> ethdev does. If driver-specific stats are needed, we can add xstats
>> later to the API.
>>
>> Appreciate your further thoughts on this, thanks.
>>
>> Regards,
>> /Bruce
>
> I generally agree with Bruce's points above.
>
> I would like to share a couple of ideas for further discussion:
>
> 1. API for bulk operations.
> The ability to prepare a vector of DMA operations, and then post it to the
> DMA driver.

We considered bulk operations and finally decided not to support them:
1. DMA engines are not well suited to small-packet scenarios with high
   PPS. PS: vector operations are what suit high PPS.
2. To support posting bulk ops, we would need to define a standard struct
   like rte_mbuf, and the application may need to initialize the struct
   fields and pass them as a pointer array, which may cost too much CPU.
3. Posting a request is simpler than processing completed operations, and
   CPU write performance is also good. The driver could still use vectors
   to accelerate the processing of completed operations.

>
> 2. Prepare the API for more complex DMA operations than just copy/fill.
> E.g. blitter operations like "copy A bytes from the source starting at
> address X, to the destination starting at address Y, masked with the bytes
> starting at address Z, then skip B bytes at the source and C bytes at the
> destination, rewind the mask to the beginning of Z, and repeat D times".
> This is just an example.
> I'm suggesting to use a "DMA operation" union structure as parameter to the
> command enqueue function, rather than having individual functions for each
> possible DMA operation.

There are many situations in which such a structure may be hard to define,
so I prefer separate APIs like copy/fill/...
PS: I saw that struct dma_device (Linux dmaengine.h) also supports various
prep_xxx APIs.

> I know I'm not the only one old enough on the mailing list to have worked
> with the Commodore Amiga's blitter. :-)
> DPDK has lots of code using CPU vector instructions to shuffle bytes around.
> I can easily imagine a DMA engine doing similar jobs, possibly implemented in
> an FPGA or some other coprocessor.
>
> -Morten
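
To make the "separate APIs" preference concrete, here is a rough sketch of
one possible shape, combined with the void-pointer and id-value ideas Bruce
raised. This is only an illustration under stated assumptions: every name
below (stub_dma_dev, stub_dma_copy, stub_dma_fill, stub_dma_completed) is
hypothetical and is not the proposed dmadev API, and the "device" is a
software stub that finishes each job synchronously with memcpy/memset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software stub; all names are illustrative only. */
struct stub_dma_dev {
    uint16_t next_id;      /* id handed to the next enqueued job */
    uint16_t nb_completed; /* jobs finished but not yet reported */
};

/* One function per operation: copy. Takes void pointers, so a plain
 * memcpy fallback works. Returns the id assigned to this job. */
static inline uint16_t
stub_dma_copy(struct stub_dma_dev *dev, const void *src, void *dst,
              uint32_t len)
{
    assert(dev != NULL);
    memcpy(dst, src, len); /* the stub completes the job immediately */
    dev->nb_completed++;
    return dev->next_id++;
}

/* One function per operation: fill with a byte pattern. */
static inline uint16_t
stub_dma_fill(struct stub_dma_dev *dev, uint8_t pattern, void *dst,
              uint32_t len)
{
    assert(dev != NULL);
    memset(dst, pattern, len);
    dev->nb_completed++;
    return dev->next_id++;
}

/* Report up to max_ops completed jobs. On a non-zero return, *last_id
 * holds the id of the newest job reported by this call. The app keeps
 * its own per-job metadata, indexed by id. */
static inline uint16_t
stub_dma_completed(struct stub_dma_dev *dev, uint16_t max_ops,
                   uint16_t *last_id)
{
    uint16_t n = dev->nb_completed < max_ops ? dev->nb_completed : max_ops;

    if (n > 0) {
        /* ids are sequential; oldest unreported is next_id - nb_completed */
        *last_id = (uint16_t)(dev->next_id - dev->nb_completed + n - 1);
        dev->nb_completed = (uint16_t)(dev->nb_completed - n);
    }
    return n;
}
```

With this shape the application tracks jobs only by the returned id, and a
new operation type is added as a new function rather than as a new member
of a shared operation struct.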