> -----Original Message-----
> From: Ferruh Yigit <ferruh.yi...@amd.com>
> Sent: Friday, February 10, 2023 1:04 PM
> To: Koikkara Reeny, Shibin <shibin.koikkara.re...@intel.com>;
> dev@dpdk.org; Zhang, Qi Z <qi.z.zh...@intel.com>; Burakov, Anatoly
> <anatoly.bura...@intel.com>; Richardson, Bruce
> <bruce.richard...@intel.com>; Mcnamara, John
> <john.mcnam...@intel.com>
> Cc: Loftus, Ciara <ciara.lof...@intel.com>
> Subject: Re: [PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration
>
> On 2/9/2023 12:05 PM, Shibin Koikkara Reeny wrote:
> > Integrate support for the AF_XDP CNI and device plugin [1] so that the
> > DPDK AF_XDP PMD can work in an unprivileged container environment.
> > Part of the AF_XDP PMD initialization process involves loading an eBPF
> > program onto the given netdev. This operation requires privileges,
> > which prevents the PMD from being able to work in an unprivileged
> > container (without root access). The plugin CNI handles the program
> > loading. CNI open Unix Domain Socket (UDS) and waits listening for a
> > client to make requests over that UDS. The client(DPDK) connects and a
> > "handshake" occurs, then the File Descriptor which points to the
> > XSKMAP associated with the loaded eBPF program is handed over to the
> > client. The client can then proceed with creating an AF_XDP socket and
> > inserting the socket into the XSKMAP pointed to by the FD received on
> > the UDS.
> >
> > A new vdev arg "use_cni" is created to indicate user wishes to run the
> > PMD in unprivileged mode and to receive the XSKMAP FD from the CNI.
> > When this flag is set, the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf
> > flag should be used when creating the socket, which tells libbpf not
> > to load the default libbpf program on the netdev. We tell libbpf not
> > to do this because the loading is handled by the CNI in this scenario.
> >
> > Patch include howto doc explain how to configure AF_XDP CNI to working
> > with DPDK.
> >
> > [1]: https://github.com/intel/afxdp-plugins-for-kubernetes
> >
> > Signed-off-by: Shibin Koikkara Reeny <shibin.koikkara.re...@intel.com>
>
>
> Is Anatoly's tested-by tag still valid with this version?

Yes, it is still valid.

>
> <...>
>
> > @@ -1413,7 +1678,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
> >  		}
> >  	}
> >
> > -	if (rxq->busy_budget) {
> > +	if (internals->use_cni) {
> > +		int err, fd, map_fd;
> > +
> > +		/* get socket fd from CNI plugin */
> > +		map_fd = get_cni_fd(internals->if_name);
> > +		if (map_fd < 0) {
> > +			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
> > +			goto out_xsk;
> > +		}
> > +		/* get socket fd */
> > +		fd = xsk_socket__fd(rxq->xsk);
> > +		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
> > +		if (err) {
> > +			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
> > +			goto out_xsk;
> > +		}
> > +	} else if (rxq->busy_budget) {
>
>
> 'use_cni' argument is added as if-else, this result 'use_cni' parameter
> automatically makes 'busy_budget' argument ineffective, is this intentional?
> If so can you please describe why?
> And can you please document this in the driver documentation that 'use_cni'
> and 'busy_budget' paramters are mutually exclusive.
> May be this condition can be checked and an error message sent in runtime,
> not sure.
>

With the "use_cni" option, the PMD cannot configure busy_budget itself,
because DPDK is running inside a container with limited permissions. To
configure busy_budget in this mode, a request has to be sent to the CNI
plugin, and the CNI plugin then configures busy_poll.

>
> Similarly, another parameter check above this (not visible in this patch),
> xdp_prog (custom_prog_configured) is calling same APIs
> (bpf_map_update_elem()), if both paramters are provided, 'use_cni' will
> overwrite previous one, is this intentional?
> Are 'use_cni' & 'xdp_prog' paramters mutually exclusive?

With "use_cni" we do not have permission to load the xdp_prog, because our
privileges are limited inside the container. The CNI plugin handles the
loading of the program.

>
>
> Overall is the combination of 'use_cni' paramter with other parameters
> tested?

We have tested the communication with the CNI plugin, which loads the
program, and we have tested the traffic flow.

>
>
> >  		ret = configure_preferred_busy_poll(rxq);
> >  		if (ret) {
> >  			AF_XDP_LOG(ERR, "Failed configure busy polling.\n");
> > @@ -1584,6 +1865,27 @@ static const struct eth_dev_ops ops = {
> >  	.get_monitor_addr = eth_get_monitor_addr,
> >  };
> >
> > +/* CNI option works in unprivileged container environment
> > + * and ethernet device functionality will be reduced. So
> > + * additional customiszed eth_dev_ops struct is needed
> > + * for cni. Promiscuous enable and disable functionality
> > + * is removed.
>
>
> Why promiscuous enable and disable functionality can't be used with
> 'use_cni'?

When we use "use_cni" we are running dpdk_testpmd inside a Docker
container, where we have only limited permissions. That is the reason I
described it as an "unprivileged container environment" in the comment.

>
> Can you please document the limitation in the driver document, also if
> possible briefly mention reason of the limitation?

In the documentation we have added the following as a prerequisite:

+* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
+  for AF_XDP along with support for hugepages.

And in the Background section:

+The standard `AF_XDP PMD`_ initialization process involves loading an eBPF program
+onto the kernel netdev to be used by the PMD. This operation requires root or
+escalated Linux privileges and thus prevents the PMD from working in an
+unprivileged container. The AF_XDP CNI plugin handles this situation by
+providing a device plugin that performs the program loading.

If you think we need to add more, please let me know.
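
The runtime check suggested above (rejecting 'use_cni' when 'busy_budget' or
'xdp_prog' is also given) could be sketched roughly as below. This is only an
illustration under assumptions: struct cni_arg_view, its fields and
check_cni_args() are hypothetical names and are not part of the patch or the
PMD. A real implementation would work on the driver's own parsed devargs and
log through AF_XDP_LOG rather than fprintf.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical view of the parsed vdev arguments (not the PMD's real struct). */
struct cni_arg_view {
	int use_cni;            /* non-zero if "use_cni" was requested */
	int busy_budget_given;  /* non-zero if the user explicitly passed "busy_budget" */
	const char *xdp_prog;   /* custom program path, or NULL if none was given */
};

/*
 * Return 0 if the argument combination is acceptable, -EINVAL otherwise.
 * Assumption: in CNI mode the plugin owns program loading and busy-poll
 * setup, so the PMD-side options are rejected instead of silently ignored.
 */
static int
check_cni_args(const struct cni_arg_view *args)
{
	assert(args != NULL);

	if (!args->use_cni)
		return 0; /* no restriction outside CNI mode */

	if (args->xdp_prog != NULL) {
		fprintf(stderr, "'use_cni' and 'xdp_prog' cannot be combined\n");
		return -EINVAL;
	}

	if (args->busy_budget_given) {
		fprintf(stderr, "'use_cni' and 'busy_budget' cannot be combined\n");
		return -EINVAL;
	}

	return 0;
}
```

Called once after argument parsing, a check like this would turn the implicit
if/else precedence into an explicit error for the user.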