On 5/16/18 11:46 PM, Björn Töpel wrote:
2018-05-04 1:38 GMT+02:00 Alexei Starovoitov <alexei.starovoi...@gmail.com>:
On Fri, May 04, 2018 at 12:49:09AM +0200, Daniel Borkmann wrote:
On 05/02/2018 01:01 PM, Björn Töpel wrote:
From: Björn Töpel <bjorn.to...@intel.com>

This patch set introduces a new address family called AF_XDP that is
optimized for high performance packet processing and, in upcoming
patch sets, zero-copy semantics. In this patch set, we have removed
all zero-copy related code in order to make it smaller, simpler and
hopefully more review friendly. This patch set only supports copy-mode
for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
for RX using the XDP_DRV path. Zero-copy support requires XDP and
driver changes that Jesper Dangaard Brouer is working on. Some of his
work has already been accepted. We will publish our zero-copy support
for RX and TX on top of his patch sets at a later point in time.

+1, would be great to see it land this cycle. Saw few minor nits here
and there but nothing to hold it up, for the series:

Acked-by: Daniel Borkmann <dan...@iogearbox.net>

Thanks everyone!

Great stuff!

Applied to bpf-next, with one condition.
Upcoming zero-copy patches for both RX and TX need to be posted
and reviewed within this release window.
If netdev community as a whole won't be able to agree on the zero-copy
bits we'd need to revert this feature before the next merge window.

Few other minor nits:
patch 3:
+struct xdp_ring {
+       __u32 producer __attribute__((aligned(64)));
+       __u32 consumer __attribute__((aligned(64)));
+};
It kinda begs for ____cacheline_aligned_in_smp to be introduced for uapi 
headers.


Hmm, I need some guidance on what a sane uapi variant would be. We
can't have the uapi depend on the kernel build. ARM64, e.g., can have
both 64B and 128B according to the specs. Contemporary IA processors
have 64B.

The simplest, and maybe most future-proof, would be 128B aligned for
all. Another is having 128B for ARM and 64B for all IA. A third option
is having a hand-shaking API (I think virtio has that) for determine
the cache line size, but I'd rather not go down that route.

Thoughts/ideas on how a uapi ____cacheline_aligned_in_smp version
would look like?

I suspect i40e+arm combination wasn't tested anyway.
The api may have endianness issues too on something like sparc.
I think the way to be backwards compatible in this area
is to make the api usable on x86 only by adding
to include/uapi/linux/if_xdp.h
#if defined(__x86_64__)
#define AF_XDP_CACHE_BYTES 64
#else
#error "AF_XDP support is not yet available for this architecture"
#endif
and doing:
    __u32 producer __attribute__((aligned(AF_XDP_CACHE_BYTES)));
    __u32 consumer __attribute__((aligned(AF_XDP_CACHE_BYTES)));

And progressively add to this for arm64 and few other archs.
Eventually removing #error and adding some generic define
that's good enough for long tail of architectures that
we really cannot test.

Reply via email to