On 2/10/2023 3:38 PM, Koikkara Reeny, Shibin wrote:
>
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>> Sent: Friday, February 10, 2023 1:04 PM
>> To: Koikkara Reeny, Shibin <shibin.koikkara.re...@intel.com>;
>> dev@dpdk.org; Zhang, Qi Z <qi.z.zh...@intel.com>; Burakov, Anatoly
>> <anatoly.bura...@intel.com>; Richardson, Bruce
>> <bruce.richard...@intel.com>; Mcnamara, John
>> <john.mcnam...@intel.com>
>> Cc: Loftus, Ciara <ciara.lof...@intel.com>
>> Subject: Re: [PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration
>>
>> On 2/9/2023 12:05 PM, Shibin Koikkara Reeny wrote:
>>> Integrate support for the AF_XDP CNI and device plugin [1] so that the
>>> DPDK AF_XDP PMD can work in an unprivileged container environment.
>>> Part of the AF_XDP PMD initialization process involves loading an eBPF
>>> program onto the given netdev. This operation requires privileges,
>>> which prevents the PMD from being able to work in an unprivileged
>>> container (without root access). The plugin CNI handles the program
>>> loading. CNI open Unix Domain Socket (UDS) and waits listening for a
>>> client to make requests over that UDS. The client(DPDK) connects and a
>>> "handshake" occurs, then the File Descriptor which points to the
>>> XSKMAP associated with the loaded eBPF program is handed over to the
>>> client. The client can then proceed with creating an AF_XDP socket and
>>> inserting the socket into the XSKMAP pointed to by the FD received on
>>> the UDS.
>>>
>>> A new vdev arg "use_cni" is created to indicate user wishes to run the
>>> PMD in unprivileged mode and to receive the XSKMAP FD from the CNI.
>>> When this flag is set, the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf
>>> flag should be used when creating the socket, which tells libbpf not
>>> to load the default libbpf program on the netdev. We tell libbpf not
>>> to do this because the loading is handled by the CNI in this scenario.
>>>
>>> Patch include howto doc explain how to configure AF_XDP CNI to working
>>> with DPDK.
>>>
>>> [1]: https://github.com/intel/afxdp-plugins-for-kubernetes
>>>
>>> Signed-off-by: Shibin Koikkara Reeny <shibin.koikkara.re...@intel.com>
>>
>>
>> Is Anatoly's tested-by tag still valid with this version?
>
> Yes it is still valid.
>
>>
>> <...>
>>
>>> @@ -1413,7 +1678,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
>>>  		}
>>>  	}
>>>
>>> -	if (rxq->busy_budget) {
>>> +	if (internals->use_cni) {
>>> +		int err, fd, map_fd;
>>> +
>>> +		/* get socket fd from CNI plugin */
>>> +		map_fd = get_cni_fd(internals->if_name);
>>> +		if (map_fd < 0) {
>>> +			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
>>> +			goto out_xsk;
>>> +		}
>>> +		/* get socket fd */
>>> +		fd = xsk_socket__fd(rxq->xsk);
>>> +		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
>>> +		if (err) {
>>> +			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
>>> +			goto out_xsk;
>>> +		}
>>> +	} else if (rxq->busy_budget) {
>>
>>
>> The 'use_cni' argument is added as an if-else, so the 'use_cni' parameter
>> automatically makes the 'busy_budget' argument ineffective. Is this
>> intentional? If so, can you please describe why?
>> And can you please document in the driver documentation that the 'use_cni'
>> and 'busy_budget' parameters are mutually exclusive?
>> Maybe this condition can be checked and an error message sent at runtime,
>> not sure.
>>
>
> When we use the "use_cni" option, in order to configure the busy_budget we
> need to send the request to the CNI plugin,
> and the CNI plugin will configure the busy_poll, as DPDK is running inside a
> container with limited permissions.
>
>>
>> Similarly, another parameter check above this (not visible in this patch),
>> xdp_prog (custom_prog_configured), is calling the same API
>> (bpf_map_update_elem()). If both parameters are provided, 'use_cni' will
>> overwrite the previous one. Is this intentional?
>> Are the 'use_cni' & 'xdp_prog' parameters mutually exclusive?
>
> When we use "use_cni" we don't have the permission to load the xdp_prog, as
> our privileges are limited inside the container.
> The CNI plugin handles the loading of the program.
Yes, but what happens if the user provides the 'xdp_prog' parameter?

>>
>>
>> Overall, is the combination of the 'use_cni' parameter with other
>> parameters tested?
>
> We have tested the communication with the CNI plugin, which loads the
> program, and the traffic flow.
>

I got that, but is the combination of the 'use_cni' parameter with other
parameters tested? For example, what happens if the user provides both
'xdp_prog' & 'use_cni'? There is no documentation for this condition, and
there is no check in the code that would give the user a log message.

>>
>>
>>>  		ret = configure_preferred_busy_poll(rxq);
>>>  		if (ret) {
>>>  			AF_XDP_LOG(ERR, "Failed configure busy polling.\n");
>>> @@ -1584,6 +1865,27 @@ static const struct eth_dev_ops ops = {
>>>  	.get_monitor_addr = eth_get_monitor_addr,
>>>  };
>>>
>>> +/* CNI option works in unprivileged container environment
>>> + * and ethernet device functionality will be reduced. So
>>> + * additional customiszed eth_dev_ops struct is needed
>>> + * for cni. Promiscuous enable and disable functionality
>>> + * is removed.
>>
>>
>> Why can't the promiscuous enable and disable functionality be used with
>> 'use_cni'?
>
> When we use "use_cni" we are running dpdk_testpmd inside a docker container,
> and inside the container we have only
> limited permissions. That is the reason I have written "unprivileged
> container environment"
> in the comment.
>>
>> Can you please document the limitation in the driver documentation, and if
>> possible briefly mention the reason for the limitation?
>
> In the documentation, as prerequisites, we have added:
> +* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
> +  for AF_XDP along with support for hugepages.
>
> In the Background:
> +The standard `AF_XDP PMD`_ initialization process involves loading an eBPF program
> +onto the kernel netdev to be used by the PMD. This operation requires root or
> +escalated Linux privileges and thus prevents the PMD from working in an
> +unprivileged container. The AF_XDP CNI plugin handles this situation by
> +providing a device plugin that performs the program loading.
>
> If you think we need to add more please let me know.
>

Hi Shibin,

Thanks for the update.

I think it would be good to update the driver documentation,
'doc/guides/nics/af_xdp.rst', where the 'use_cni' parameter is documented,
with the following additional information:
- When the 'use_cni' parameter is used, the 'busy_budget' parameter is not
  valid and has no impact.
- When the 'use_cni' parameter is used, the 'xdp_prog' parameter is not valid
  and ? (what happens when provided)
- Enabling and disabling promiscuous mode is not supported, with a brief
  description of why (I know the code has a comment for it, but let's put it
  in the documentation too).
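
To illustrate the kind of runtime check mentioned earlier in this review, here is a minimal sketch. It is not taken from the patch. The struct `af_xdp_devargs`, its fields, and the function `check_cni_devargs()` are hypothetical names used only for illustration, and plain `fprintf()` stands in for the driver's logging macro. Rejecting the 'use_cni' + 'xdp_prog' combination, and only warning about 'busy_budget', is an assumed policy, not the behaviour of the submitted code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for parsed vdev arguments (assumption: the real
 * driver keeps these values in its own internal structures). */
struct af_xdp_devargs {
	bool use_cni;          /* "use_cni=1" was given */
	bool busy_budget_set;  /* user passed "busy_budget" explicitly */
	const char *xdp_prog;  /* path from "xdp_prog", or NULL if absent */
};

/* Return 0 if the argument combination is acceptable, -EINVAL otherwise.
 * Assumed policy: 'xdp_prog' is rejected together with 'use_cni', while
 * 'busy_budget' only triggers a warning. */
static int
check_cni_devargs(const struct af_xdp_devargs *args)
{
	if (!args->use_cni)
		return 0;

	if (args->xdp_prog != NULL && args->xdp_prog[0] != '\0') {
		fprintf(stderr, "'use_cni' and 'xdp_prog' are mutually exclusive: "
			"the CNI plugin loads the eBPF program\n");
		return -EINVAL;
	}

	if (args->busy_budget_set)
		fprintf(stderr, "'busy_budget' has no effect with 'use_cni'\n");

	return 0;
}
```

A check like this would run once after the vdev arguments are parsed, before any queue is configured, so the user gets a log message instead of one parameter silently overriding another.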