On 06/04/2018 06:24 PM, Alexei Starovoitov wrote:
> On Mon, Jun 04, 2018 at 01:57:10PM +0200, Björn Töpel wrote:
>> From: Björn Töpel <bjorn.to...@intel.com>
>>
>> An issue with the current AF_XDP uapi raised by Mykyta Iziumtsev (see
>> https://www.spinics.net/lists/netdev/msg503664.html) is that it does
>> not support NICs that have a "type-writer" model in an efficient
>> way. In this model, a memory window is passed to the hardware and
>> multiple frames might be filled into that window, instead of just one
>> that we have in the current fixed frame-size model.
>>
>> This patch set fixes two bugs in the current implementation and then
>> changes the uapi so that the type-writer model can be supported
>> efficiently by a possible future extension of AF_XDP.
>>
>> These are the uapi changes in this patch:
>>
>> * Change the "u32 idx" in the descriptors to "u64 addr". The current
>>   idx based format does NOT work for the type-writer model (as packets
>>   can start anywhere within a frame), but a relative address
>>   pointer (the u64 addr) works well for both models in the prototype
>>   code we have that supports both models. We increased it from u32 to
>>   u64 to support umems larger than 4G. We have also removed the u16
>>   offset when having a "u64 addr" since that information is already
>>   carried in the least significant bits of the address.
>>
>> * We want to use "u8 padding[5]" for something useful in the future
>>   (since we are not allowed to change its name), so we now call it
>>   just options so it can be extended for various purposes in the
>>   future. It is a u32, as that is what is left of the 16 byte
>>   descriptor.
>>
>> * We changed the name of frame_size in the UMEM_REG setsockopt to
>>   chunk_size since this naming also makes sense to the type-writer
>>   model.
>>
>> With these changes to the uapi, we believe the type-writer model can
>> be supported without having to resort to a new descriptor format. The
>> type-writer model could then be supported, from the uapi point of
>> view, by setting a flag at bind time and providing a new flag bit in
>> the options field of the descriptor that signals to user space that
>> all packets have been written in a chunk. Or with a new chunk
>> completion queue as suggested by Mykyta in his latest feedback mail on
>> the list.
>
> for the set:
> Acked-by: Alexei Starovoitov <a...@kernel.org>
> Thank you for these fixes.
> According to unofficial feedback from brcm and netronome folks
> the descriptor format should work for these nics too.
> At some point we may consider a second format, but I think SW
> should drive HW requirements and not the other way around.

LGTM as well, applied to bpf-next, thanks!
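
For readers following the descriptor change discussed in the cover letter, here is a minimal sketch of a 16-byte descriptor with a "u64 addr" and a "u32 options", and of how the per-chunk offset can be recovered from the low bits of the address. The names sketch_desc, sketch_chunk_base and sketch_chunk_offset are hypothetical, the len field is an assumption about what occupies the remaining 4 bytes, and the masking assumes chunk_size is a power of two. This is not taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the 16-byte descriptor described in the thread.
 * Only "u64 addr" and "u32 options" come from the cover letter; the
 * "len" field is an assumption used to fill the remaining 4 bytes. */
struct sketch_desc {
    uint64_t addr;    /* address relative to the start of the umem */
    uint32_t len;     /* assumed: packet length in bytes */
    uint32_t options; /* reserved for future use, e.g. flag bits */
};

/* The cover letter says the descriptor is 16 bytes; check it at compile time. */
static_assert(sizeof(struct sketch_desc) == 16, "descriptor must be 16 bytes");

/* Start of the chunk that contains addr.
 * Assumption: chunk_size is a power of two, so masking is valid. */
static inline uint64_t sketch_chunk_base(uint64_t addr, uint64_t chunk_size)
{
    return addr & ~(chunk_size - 1);
}

/* Offset of addr inside its chunk: the least significant bits of the
 * address, which is why a separate u16 offset field is not needed. */
static inline uint64_t sketch_chunk_offset(uint64_t addr, uint64_t chunk_size)
{
    return addr & (chunk_size - 1);
}
```

With a chunk_size of 2048, an addr of 0x1234 falls in the chunk starting at 0x1000 with an offset of 0x234, so a packet may start anywhere inside a chunk without needing an extra field.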