Hello,

On Fri, Dec 15, 2023 at 4:40 PM Maryam Tahhan <mtah...@redhat.com> wrote:
>
> With the original 'use_cni' implementation, (using a
> hardcoded socket rather than a configurable one),
> if a DPDK pod is requesting multiple net devices
> and these devices are from different pools, then
> the container attempts to mount all the netdev UDSes
> in the pod as /tmp/afxdp.sock. Which means that at best
> only 1 netdev will handshake correctly with the AF_XDP
> DP. This patch addresses this by making the socket
> parameter configurable using a new vdev param called
> 'uds_path' and removing the previous 'use_cni' param.
> This patch also fixes incorrect references to the
> AF_XDP DP as CNI and updates the documentation with a
> working example. This change has been tested with the

If there are fixes mixed in, please separate them in dedicated patches
so we can backport them to LTS releases.

> AF_XDP DP PR 81[1], with both single and multiple interfaces.
>
> [1] https://github.com/intel/afxdp-plugins-for-kubernetes/pull/81

On the patch title, mentioning internals like uds_path does not help a
user: it is hard to tell what this change is about at first glance.
Finding a good title is hard, but maybe something like:
"net/af_xdp: enhance multiple interfaces support" ?

>
> v6:
> * Add link to PR 81 in commit message
> * Add release notes changes to this patchset
>
> v5:
> * Fix alignment for ETH_AF_XDP_USE_DP_UDS_PATH_ARG
> * Remove use_cni references in af_xdp.rst
>
> v4:
> * Rename af_xdp_cni.rst to af_xdp_dp.rst
> * Removed all incorrect references to CNI throughout af_xdp
> PMD file.
> * Fixed Typos in af_xdp_dp.rst
>
> v3:
> * Remove `use_cni` vdev argument as it's no longer needed.
> * Update incorrect CNI references for the AF_XDP DP in the
> documentation.
> * Update the documentation to run a simple example with the
> AF_XDP DP plugin in K8s.
>
> v2:
> * Rename sock_path to uds_path.
> * Update documentation to reflect when CAP_BPF is needed.
> * Fix testpmd arguments in the provided example for Pods.
> * Use AF_XDP API to update the xskmap entry.

The patch history block above has no place in the commitlog.
It should be in the annotations part of the patch.
https://doc.dpdk.org/guides/contributing/patches.html#creating-patches

>
> Signed-off-by: Maryam Tahhan <mtah...@redhat.com>
> Reviewed-by: Ciara Loftus <ciara.lof...@intel.com>
> ---
> doc/guides/howto/af_xdp_cni.rst | 253 ----------------------
> doc/guides/howto/af_xdp_dp.rst | 278 +++++++++++++++++++++++++

Renaming the file seems fine to me.
However, don't add unrelated whitespace or line-wrap changes that
make it hard for git to see it is a rename.

For example:
$ git show -M05 -- doc/
...
-The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
-onto the kernel netdev to be used by the PMD.
-This operation requires root or escalated Linux privileges
...
+The standard :doc:`../nics/af_xdp` initialization process involves
+loading an eBPF program onto the Kernel netdev to be used by the PMD.
+This operation requires root or escalated Linux privileges and prevents
...

> doc/guides/howto/index.rst | 2 +-
> doc/guides/nics/af_xdp.rst | 27 ++-
> doc/guides/rel_notes/release_24_03.rst | 7 +
> drivers/net/af_xdp/rte_eth_af_xdp.c | 100 +++++----
> 6 files changed, 352 insertions(+), 315 deletions(-)
> delete mode 100644 doc/guides/howto/af_xdp_cni.rst
> create mode 100644 doc/guides/howto/af_xdp_dp.rst
>
> diff --git a/doc/guides/howto/af_xdp_cni.rst b/doc/guides/howto/af_xdp_cni.rst
> deleted file mode 100644
> index a1a6d5b99c..0000000000
> --- a/doc/guides/howto/af_xdp_cni.rst
> +++ /dev/null
> @@ -1,253 +0,0 @@
> -.. SPDX-License-Identifier: BSD-3-Clause
> -   Copyright(c) 2023 Intel Corporation.
> -
> -Using a CNI with the AF_XDP driver
> -==================================
> -
> -Introduction
> -------------
> -
> -CNI, the Container Network Interface, is a technology for configuring
> -container network interfaces
> -and which can be used to setup Kubernetes networking.
> -AF_XDP is a Linux socket Address Family that enables an XDP program
> -to redirect packets to a memory buffer in userspace.
> -
> -This document explains how to enable the `AF_XDP Plugin for Kubernetes`_ within
> -a DPDK application using the :doc:`../nics/af_xdp` to connect and use these technologies.
> -
> -.. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
> -
> -
> -Background
> -----------
> -
> -The standard :doc:`../nics/af_xdp` initialization process involves loading an eBPF program
> -onto the kernel netdev to be used by the PMD.
> -This operation requires root or escalated Linux privileges > -and thus prevents the PMD from working in an unprivileged container. > -The AF_XDP CNI plugin handles this situation > -by providing a device plugin that performs the program loading. > - > -At a technical level the CNI opens a Unix Domain Socket and listens for a > client > -to make requests over that socket. > -A DPDK application acting as a client connects and initiates a configuration > "handshake". > -The client then receives a file descriptor which points to the XSKMAP > -associated with the loaded eBPF program. > -The XSKMAP is a BPF map of AF_XDP sockets (XSK). > -The client can then proceed with creating an AF_XDP socket > -and inserting that socket into the XSKMAP pointed to by the descriptor. > - > -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes > -to run the PMD in unprivileged mode and to receive the XSKMAP file descriptor > -from the CNI. > -When this flag is set, > -the ``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag > -should be used when creating the socket > -to instruct libbpf not to load the default libbpf program on the netdev. > -Instead the loading is handled by the CNI. > - > -.. note:: > - > - The Unix Domain Socket file path appear in the end user is > "/tmp/afxdp.sock". > - > - > -Prerequisites > -------------- > - > -Docker and container prerequisites: > - > -* Set up the device plugin > - as described in the instructions for `AF_XDP Plugin for Kubernetes`_. > - > -* The Docker image should contain the libbpf and libxdp libraries, > - which are dependencies for AF_XDP, > - and should include support for the ``ethtool`` command. > - > -* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and > ``CAP_BPF`` > - for AF_XDP along with support for hugepages. > - > -* Increase locked memory limit so containers have enough memory for packet > buffers. > - For example: > - > - .. 
code-block:: console > - > - cat << EOF | sudo tee > /etc/systemd/system/containerd.service.d/limits.conf > - [Service] > - LimitMEMLOCK=infinity > - EOF > - > -* dpdk-testpmd application should have AF_XDP feature enabled. > - > - For further information see the docs for the: :doc:`../../nics/af_xdp`. > - > - > -Example > -------- > - > -Howto run dpdk-testpmd with CNI plugin: > - > -* Clone the CNI plugin > - > - .. code-block:: console > - > - # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git > - > -* Build the CNI plugin > - > - .. code-block:: console > - > - # cd afxdp-plugins-for-kubernetes/ > - # make build > - > - .. note:: > - > - CNI plugin has a dependence on the config.json. > - > - Sample Config.json > - > - .. code-block:: json > - > - { > - "logLevel":"debug", > - "logFile":"afxdp-dp-e2e.log", > - "pools":[ > - { > - "name":"e2e", > - "mode":"primary", > - "timeout":30, > - "ethtoolCmds" : ["-L -device- combined 1"], > - "devices":[ > - { > - "name":"ens785f0" > - } > - ] > - } > - ] > - } > - > - For further reference please use the `config.json`_ > - > - .. _config.json: > https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/config.json > - > -* Create the Network Attachment definition > - > - .. code-block:: console > - > - # kubectl create -f nad.yaml > - > - Sample nad.yml > - > - .. 
code-block:: yaml > - > - apiVersion: "k8s.cni.cncf.io/v1" > - kind: NetworkAttachmentDefinition > - metadata: > - name: afxdp-e2e-test > - annotations: > - k8s.v1.cni.cncf.io/resourceName: afxdp/e2e > - spec: > - config: '{ > - "cniVersion": "0.3.0", > - "type": "afxdp", > - "mode": "cdq", > - "logFile": "afxdp-cni-e2e.log", > - "logLevel": "debug", > - "ipam": { > - "type": "host-local", > - "subnet": "192.168.1.0/24", > - "rangeStart": "192.168.1.200", > - "rangeEnd": "192.168.1.216", > - "routes": [ > - { "dst": "0.0.0.0/0" } > - ], > - "gateway": "192.168.1.1" > - } > - }' > - > - For further reference please use the `nad.yaml`_ > - > - .. _nad.yaml: > https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/nad.yaml > - > -* Build the Docker image > - > - .. code-block:: console > - > - # docker build -t afxdp-e2e-test -f Dockerfile . > - > - Sample Dockerfile: > - > - .. code-block:: console > - > - FROM ubuntu:20.04 > - RUN apt-get update -y > - RUN apt install build-essential libelf-dev -y > - RUN apt-get install iproute2 acl -y > - RUN apt install python3-pyelftools ethtool -y > - RUN apt install libnuma-dev libjansson-dev libpcap-dev net-tools -y > - RUN apt-get install clang llvm -y > - COPY ./libbpf<version>.tar.gz /tmp > - RUN cd /tmp && tar -xvmf libbpf<version>.tar.gz && cd libbpf/src && > make install > - COPY ./libxdp<version>.tar.gz /tmp > - RUN cd /tmp && tar -xvmf libxdp<version>.tar.gz && cd libxdp && make > install > - > - .. note:: > - > - All the files that need to COPY-ed should be in the same directory as > the Dockerfile > - > -* Run the Pod > - > - .. code-block:: console > - > - # kubectl create -f pod.yaml > - > - Sample pod.yaml: > - > - .. 
code-block:: yaml
> -
> -     apiVersion: v1
> -     kind: Pod
> -     metadata:
> -       name: afxdp-e2e-test
> -       annotations:
> -         k8s.v1.cni.cncf.io/networks: afxdp-e2e-test
> -     spec:
> -       containers:
> -       - name: afxdp
> -         image: afxdp-e2e-test:latest
> -         imagePullPolicy: Never
> -         env:
> -         - name: LD_LIBRARY_PATH
> -           value: /usr/lib64/:/usr/local/lib/
> -         command: ["tail", "-f", "/dev/null"]
> -         securityContext:
> -           capabilities:
> -             add:
> -               - CAP_NET_RAW
> -               - CAP_BPF
> -         resources:
> -           requests:
> -             hugepages-2Mi: 2Gi
> -             memory: 2Gi
> -             afxdp/e2e: '1'
> -           limits:
> -             hugepages-2Mi: 2Gi
> -             memory: 2Gi
> -             afxdp/e2e: '1'
> -
> -  For further reference please use the `pod.yaml`_
> -
> -  .. _pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/v0.0.2/test/e2e/pod-1c1d.yaml
> -
> -* Run DPDK with a command like the following:
> -
> -  .. code-block:: console
> -
> -     kubectl exec -i <Pod name> --container <containers name> -- \
> -         /<Path>/dpdk-testpmd -l 0,1 --no-pci \
> -         --vdev=net_af_xdp0,use_cni=1,iface=<interface name> \
> -         -- --no-mlockall --in-memory
> -
> -For further reference please use the `e2e`_ test case in `AF_XDP Plugin for Kubernetes`_
> -
> -  .. _e2e: https://github.com/intel/afxdp-plugins-for-kubernetes/tree/v0.0.2/test/e2e
> diff --git a/doc/guides/howto/af_xdp_dp.rst b/doc/guides/howto/af_xdp_dp.rst
> new file mode 100644
> index 0000000000..7717d59224
> --- /dev/null
> +++ b/doc/guides/howto/af_xdp_dp.rst
> @@ -0,0 +1,278 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2023 Intel Corporation.
> +
> +Using the AF_XDP Device Plugin with the AF_XDP driver
> +======================================================

Too long by one =.

> +
> +Introduction
> +------------
> +
> +The `AF_XDP Device Plugin for Kubernetes`_ is a project that provisions
> +and advertises interfaces (that can be used with AF_XDP) to Kubernetes.
> +The project also includes a `CNI`_.
> + > +AF_XDP is a Linux socket Address Family that enables an XDP program > +to redirect packets to a memory buffer in userspace. > + > +This document explains how to use the `AF_XDP Device Plugin for Kubernetes`_ > with > +a DPDK :doc:`../nics/af_xdp` based application running in a Pod. > + > +.. _AF_XDP Device Plugin for Kubernetes: > https://github.com/intel/afxdp-plugins-for-kubernetes > +.. _CNI: https://github.com/containernetworking/cni > + > +Background > +---------- > + > +The standard :doc:`../nics/af_xdp` initialization process involves > +loading an eBPF program onto the Kernel netdev to be used by the PMD. > +This operation requires root or escalated Linux privileges and prevents > +the PMD from working in an unprivileged container. The AF_XDP Device Plugin > (DP) > +addresses this situation by providing an entity that manages eBPF program > +lifecycle for Pod interfaces that wish to use AF_XDP, this in turn allows > +the pod to be used without privilege escalation. > + > +In order for the pod to run without privilege escalation, the AF_XDP DP > +creates a Unix Domain Socket (UDS) and listens for Pods to make requests > +for XSKMAP(s) File Descriptors (FDs) for interfaces in their network > namespace. > +In other words, the DPDK application running in the Pod connects to this UDS > and > +initiates a "handshake" to retrieve the XSKMAP(s) FD(s). Upon a successful > "handshake", > +the DPDK application receives the FD(s) for the XSKMAP(s) associated with > the relevant > +netdevs. The DPDK application can then create the AF_XDP socket(s), and > attach > +the socket(s) to the netdev queue(s) by inserting the socket(s) into the > XSKMAP(s). > + > +The EAL vdev argument ``uds_path`` is used to indicate that the user > +wishes to run the AF_XDP PMD in unprivileged mode and to receive the XSKMAP > +FD from the AF_XDP DP. 
When this param is used, the > +``XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD`` libbpf flag is used when creating the > +AF_XDP socket to instruct libbpf/libxdp not to load the default eBPF redirect > +program for AF_XDP on the netdev. Instead the lifecycle management of the > eBPF > +program is handled by the AF_XDP DP. > + > +.. note:: > + > + The UDS file path inside the pod appears at > "/tmp/afxdp_dp/<netdev>/afxdp.sock". > + > +Prerequisites > +------------- > + > +Device Plugin and DPDK container prerequisites: > + > +* Create a DPDK container image. > + > +* Set up the device plugin and prepare the Pod Spec as described in > + the instructions for `AF_XDP Device Plugin for Kubernetes`_. > + > +* Increase locked memory limit so containers have enough memory for packet > buffers. > + For example: > + > + .. code-block:: console > + > + cat << EOF | sudo tee > /etc/systemd/system/containerd.service.d/limits.conf > + [Service] > + LimitMEMLOCK=infinity > + EOF > + > +* dpdk-testpmd application should have AF_XDP feature enabled. > + > + For further information see the docs for the: :doc:`../../nics/af_xdp`. > + > + > +Example > +------- > + > +How to run dpdk-testpmd with the AF_XDP Device plugin: > + > +* Clone the AF_XDP Device plugin > + > + .. code-block:: console > + > + # git clone https://github.com/intel/afxdp-plugins-for-kubernetes.git > + > +* Build the AF_XDP Device plugin and the CNI > + > + .. code-block:: console > + > + # cd afxdp-plugins-for-kubernetes/ > + # make image > + > +* Make sure to modify the image used by the `daemonset.yml`_ file in the > deployments directory with > + the following configuration: > + > + .. _daemonset.yml : > https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/deployments/daemonset.yml > + > + .. code-block:: yaml > + > + image: afxdp-device-plugin:latest > + > + .. note:: > + > + This will select the AF_XDP DP image that was built locally. 
Detailed > configuration > + options can be found in the AF_XDP Device Plugin `readme`_ . > + > + .. _readme: https://github.com/intel/afxdp-plugins-for-kubernetes#readme > + > +* Deploy the AF_XDP Device Plugin and CNI > + > + .. code-block:: console > + > + # kubectl create -f deployments/daemonset.yml > + > +* Create a Network Attachment Definition (NAD) > + > + .. code-block:: console > + > + # kubectl create -f nad.yaml > + > + Sample nad.yml > + > + .. code-block:: yaml > + > + apiVersion: "k8s.cni.cncf.io/v1" > + kind: NetworkAttachmentDefinition > + metadata: > + name: afxdp-network > + annotations: > + k8s.v1.cni.cncf.io/resourceName: afxdp/myPool > + spec: > + config: '{ > + "cniVersion": "0.3.0", > + "type": "afxdp", > + "mode": "primary", > + "logFile": "afxdp-cni.log", > + "logLevel": "debug", > + "ethtoolCmds" : ["-N -device- rx-flow-hash udp4 fn", > + "-N -device- flow-type udp4 dst-port 2152 action > 22" > + ], > + "ipam": { > + "type": "host-local", > + "subnet": "192.168.1.0/24", > + "rangeStart": "192.168.1.200", > + "rangeEnd": "192.168.1.220", > + "routes": [ > + { "dst": "0.0.0.0/0" } > + ], > + "gateway": "192.168.1.1" > + } > + }' > + > + For further reference please use the example provided by the AF_XDP DP > `nad.yaml`_ > + > + .. _nad.yaml: > https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/network-attachment-definition.yaml > + > +* Build a DPDK container image (using Docker) > + > + .. code-block:: console > + > + # docker build -t dpdk -f Dockerfile . > + > + Sample Dockerfile (should be placed in top level DPDK directory): > + > + .. 
code-block:: console > + > + FROM fedora:38 > + > + # Setup container to build DPDK applications > + RUN dnf -y upgrade && dnf -y install \ > + libbsd-devel \ > + numactl-libs \ > + libbpf-devel \ > + libbpf \ > + meson \ > + ninja-build \ > + libxdp-devel \ > + libxdp \ > + numactl-devel \ > + python3-pyelftools \ > + python38 \ > + iproute > + RUN dnf groupinstall -y 'Development Tools' > + > + # Create DPDK dir and copy over sources > + WORKDIR /dpdk > + COPY app app > + COPY builddir builddir > + COPY buildtools buildtools > + COPY config config > + COPY devtools devtools > + COPY drivers drivers > + COPY dts dts > + COPY examples examples > + COPY kernel kernel > + COPY lib lib > + COPY license license > + COPY MAINTAINERS MAINTAINERS > + COPY Makefile Makefile > + COPY meson.build meson.build > + COPY meson_options.txt meson_options.txt > + COPY usertools usertools > + COPY VERSION VERSION > + COPY ABI_VERSION ABI_VERSION > + COPY doc doc > + > + # Build DPDK > + RUN meson setup build > + RUN ninja -C build > + > + .. note:: > + > + Ensure the Dockerfile is placed in the top level DPDK directory. > + > +* Run the Pod > + > + .. code-block:: console > + > + # kubectl create -f pod.yaml > + > + Sample pod.yaml: > + > + .. code-block:: yaml > + > + apiVersion: v1 > + kind: Pod > + metadata: > + name: dpdk > + annotations: > + k8s.v1.cni.cncf.io/networks: afxdp-network > + spec: > + containers: > + - name: testpmd > + image: dpdk:latest > + command: ["tail", "-f", "/dev/null"] > + securityContext: > + capabilities: > + add: > + - NET_RAW > + - IPC_LOCK > + resources: > + requests: > + afxdp/myPool: '1' > + limits: > + hugepages-1Gi: 2Gi > + cpu: 2 > + memory: 256Mi > + afxdp/myPool: '1' > + volumeMounts: > + - name: hugepages > + mountPath: /dev/hugepages > + volumes: > + - name: hugepages > + emptyDir: > + medium: HugePages > + > + For further reference please use the `pod.yaml`_ > + > + .. 
_pod.yaml: https://github.com/intel/afxdp-plugins-for-kubernetes/blob/main/examples/pod-spec.yaml
> +
> +.. note::
> +
> +   For Kernel versions older than 5.19 `CAP_BPF` is also required in
> +   the container capabilities stanza.
> +
> +* Run DPDK with a command like the following:
> +
> +  .. code-block:: console
> +
> +     kubectl exec -i dpdk --container testpmd -- \
> +         ./build/app/dpdk-testpmd -l 0-2 --no-pci --main-lcore=2 \
> +         --vdev net_af_xdp,iface=<interface name>,start_queue=22,queue_count=1,uds_path=/tmp/afxdp_dp/<interface-name>/afxdp.sock \
> +         -- -i --a --nb-cores=2 --rxq=1 --txq=1 --forward-mode=macswap;
> diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
> index 71a3381c36..a7692e8a97 100644
> --- a/doc/guides/howto/index.rst
> +++ b/doc/guides/howto/index.rst
> @@ -8,7 +8,7 @@ HowTo Guides
>      :maxdepth: 2
>      :numbered:
>
> -    af_xdp_cni
> +    af_xdp_dp
>      lm_bond_virtio_sriov
>      lm_virtio_vhost_user
>      flow_bifurcation
> diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
> index 1932525d4d..0edc84328d 100644
> --- a/doc/guides/nics/af_xdp.rst
> +++ b/doc/guides/nics/af_xdp.rst
> @@ -151,25 +151,32 @@ instead of zero copy mode (if available).
>
>      --vdev net_af_xdp,iface=ens786f1,force_copy=1
>
> -use_cni
> -~~~~~~~
> +uds_path
> +~~~~~~~~~

Too long by one ~.

>
> -The EAL vdev argument ``use_cni`` is used to indicate that the user wishes to
> -enable the `AF_XDP Plugin for Kubernetes`_ within a DPDK application.
> +The EAL vdev argument ``uds_path`` is used to indicate that the user wishes to
> +use the `AF_XDP Plugin for Kubernetes`_ with a DPDK application running in a Pod.
>
>  .. _AF_XDP Plugin for Kubernetes: https://github.com/intel/afxdp-plugins-for-kubernetes
>
>  .. code-block:: console
>
> -   --vdev=net_af_xdp0,use_cni=1
> +   --vdev=net_af_xdp0,uds_path==/tmp/afxdp_dp/<interface-name>/afxdp.sock

I suppose we only need one =.

>
>  .. note::
>
> -   When using `use_cni`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
> -   as both of these will be handled by the AF_XDP plugin.
> -   Since the DPDK application is running in limited privileges
> -   so enabling and disabling of the promiscuous mode through the DPDK application
> -   is also not supported.
> +   The UDS ``afxdp.sock`` is available in the DPDK container through a
> +   volume mounted by the `AF_XDP Plugin for Kubernetes`_ at the path
> +   specified in the example above.
> +
> +.. note::
> +
> +   When using `uds_path`_, both parameters `xdp_prog`_ and `busy_budget`_ are disabled
> +   as both of these will be handled by the AF_XDP Device plugin (if required).
> +   Since the pod/container is running with limited privileges enabling and disabling
> +   of promiscuous mode through the DPDK application is also not supported.
> +
> +For more details please see: :doc:`../howto/af_xdp_dp`
>
>  Limitations
>  -----------
> diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
> index 6f8ad27808..606cdf6316 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -55,6 +55,13 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>
> +* **Integration of AF_XDP PMD with AF_XDP Device Plugin**

Quoting the comments a few lines before in this doc:
* **Add a title in the past tense with a full stop.**

> +
> +  The EAL vdev argument for the AF_XDP PMD ``uds_path`` was added
> +  to allow Kubernetes Pods that which to use AF_XDP with DPDK to run
> +  with limited privileges. This flag indicates that the AF_XDP PMD
> +  will be used in unprivileged mode and will receive the XSKMAP FD from
> +  the AF_XDP Device Plugin.

And double empty line before a new section in the RN.

>
>  Removed Items
>  -------------

Thanks.

--
David Marchand