[PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration

Koikkara Reeny, Shibin shibin.koikkara.reeny at intel.com
Fri Feb 10 16:38:41 CET 2023



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit at amd.com>
> Sent: Friday, February 10, 2023 1:04 PM
> To: Koikkara Reeny, Shibin <shibin.koikkara.reeny at intel.com>;
> dev at dpdk.org; Zhang, Qi Z <qi.z.zhang at intel.com>; Burakov, Anatoly
> <anatoly.burakov at intel.com>; Richardson, Bruce
> <bruce.richardson at intel.com>; Mcnamara, John
> <john.mcnamara at intel.com>
> Cc: Loftus, Ciara <ciara.loftus at intel.com>
> Subject: Re: [PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration
> 
> On 2/9/2023 12:05 PM, Shibin Koikkara Reeny wrote:
> > Integrate support for the AF_XDP CNI and device plugin [1] so that the
> > DPDK AF_XDP PMD can work in an unprivileged container environment.
> > Part of the AF_XDP PMD initialization process involves loading an eBPF
> > program onto the given netdev. This operation requires privileges,
> > which prevents the PMD from being able to work in an unprivileged
> > container (without root access). The CNI plugin handles the program
> > loading. The CNI opens a Unix Domain Socket (UDS) and listens for a
> > client to make requests over that UDS. The client (DPDK) connects and a
> > "handshake" occurs, then the File Descriptor which points to the
> > XSKMAP associated with the loaded eBPF program is handed over to the
> > client. The client can then proceed with creating an AF_XDP socket and
> > inserting the socket into the XSKMAP pointed to by the FD received on
> > the UDS.
> >
> > A new vdev arg "use_cni" is created to indicate that the user wishes to run the
> > PMD in unprivileged mode and to receive the XSKMAP FD from the CNI.
> > When this flag is set, the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf
> > flag should be used when creating the socket, which tells libbpf not
> > to load the default libbpf program on the netdev. We tell libbpf not
> > to do this because the loading is handled by the CNI in this scenario.
> >
> > The patch includes a howto doc explaining how to configure the AF_XDP CNI
> > to work with DPDK.
> >
> > [1]: https://github.com/intel/afxdp-plugins-for-kubernetes
> >
> > Signed-off-by: Shibin Koikkara Reeny <shibin.koikkara.reeny at intel.com>
> 
> 
> Is Anatoly's tested-by tag still valid with this version?

Yes, it is still valid.

> 
> <...>
> 
> > @@ -1413,7 +1678,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
> >  		}
> >  	}
> >
> > -	if (rxq->busy_budget) {
> > +	if (internals->use_cni) {
> > +		int err, fd, map_fd;
> > +
> > +		/* get socket fd from CNI plugin */
> > +		map_fd = get_cni_fd(internals->if_name);
> > +		if (map_fd < 0) {
> > +			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
> > +			goto out_xsk;
> > +		}
> > +		/* get socket fd */
> > +		fd = xsk_socket__fd(rxq->xsk);
> > +		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
> > +		if (err) {
> > +			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
> > +			goto out_xsk;
> > +		}
> > +	} else if (rxq->busy_budget) {
> 
> 
> The 'use_cni' argument is added as an if-else, so the 'use_cni' parameter
> automatically makes the 'busy_budget' argument ineffective. Is this intentional?
> If so, can you please describe why?
> And can you please document in the driver documentation that the 'use_cni'
> and 'busy_budget' parameters are mutually exclusive?
> Maybe this condition can be checked and an error message logged at runtime,
> not sure.
> 

When "use_cni" is set, configuring the busy budget requires sending a request to the CNI plugin,
and the CNI plugin then configures busy polling. This is because DPDK is running inside a container
with limited permissions.

> 
> Similarly, another parameter check above this (not visible in this patch),
> xdp_prog (custom_prog_configured), calls the same API
> (bpf_map_update_elem()). If both parameters are provided, 'use_cni' will
> overwrite the previous one. Is this intentional?
> Are the 'use_cni' & 'xdp_prog' parameters mutually exclusive?

When "use_cni" is set, we do not have permission to load the xdp_prog, because our privileges
inside the container are limited. The CNI plugin handles the loading of the program.
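
To illustrate the runtime check suggested above, here is a minimal sketch of how
conflicting arguments could be rejected up front. It is not code from the patch:
the struct, its fields and the function name are hypothetical, and it assumes the
parser records whether "busy_budget" and "xdp_prog" were given explicitly.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical view of the parsed vdev arguments; not the PMD's real struct. */
struct cni_arg_view {
	bool use_cni;         /* "use_cni" was requested */
	bool busy_budget_set; /* "busy_budget" was passed explicitly (assumption) */
	bool xdp_prog_set;    /* "xdp_prog" was passed explicitly (assumption) */
};

/*
 * Return 0 if the combination is acceptable, -EINVAL otherwise.
 * With "use_cni", program loading and busy polling are left to the
 * CNI plugin, so the PMD-side arguments are rejected here.
 */
static int check_cni_arg_conflicts(const struct cni_arg_view *args)
{
	if (!args->use_cni)
		return 0;

	if (args->xdp_prog_set) {
		fprintf(stderr, "use_cni and xdp_prog are mutually exclusive\n");
		return -EINVAL;
	}

	if (args->busy_budget_set) {
		fprintf(stderr, "use_cni and busy_budget are mutually exclusive\n");
		return -EINVAL;
	}

	return 0;
}
```

Whether such a combination should be a hard error or only a warning is a design
choice for the driver; the sketch simply shows the stricter variant.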
> 
> 
> Overall, is the combination of the 'use_cni' parameter with other parameters
> tested?

We have tested the communication with the CNI plugin, which loads the program, and the traffic flow.
 
> 
> 
> >  		ret = configure_preferred_busy_poll(rxq);
> >  		if (ret) {
> >  			AF_XDP_LOG(ERR, "Failed configure busy polling.\n");
> > @@ -1584,6 +1865,27 @@ static const struct eth_dev_ops ops = {
> >  	.get_monitor_addr = eth_get_monitor_addr,
> >  };
> >
> > +/* CNI option works in unprivileged container environment
> > + * and ethernet device functionality will be reduced. So
> > + * additional customiszed eth_dev_ops struct is needed
> > + * for cni. Promiscuous enable and disable functionality
> > + * is removed.
> 
> 
> Why can't the promiscuous enable and disable functionality be used with
> 'use_cni'?

When "use_cni" is set, we are running dpdk_testpmd inside a Docker container, where we have
only limited permissions. That is the reason I described it as an "unprivileged container
environment" in the comment.
> 
> Can you please document the limitation in the driver document, also if
> possible briefly mention reason of the limitation?

In the documentation, under prerequisites, we have added:
+* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
+  for AF_XDP along with support for hugepages.

In the Background:
+The standard `AF_XDP PMD`_ initialization process involves loading an eBPF program
+onto the kernel netdev to be used by the PMD. This operation requires root or
+escalated Linux privileges and thus prevents the PMD from working in an
+unprivileged container. The AF_XDP CNI plugin handles this situation by
+providing a device plugin that performs the program loading.
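
For readers unfamiliar with how a file descriptor such as the XSKMAP FD can be
handed from one process to another over a UDS, the following is a minimal sketch
using SCM_RIGHTS ancillary data. It is not the PMD's get_cni_fd(): the helper
names send_fd() and recv_fd() are hypothetical, and the "handshake" messages
exchanged with the CNI plugin are omitted.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send one fd over a connected Unix domain socket. */
static int send_fd(int sock, int fd)
{
	char byte = 0; /* at least one byte of regular data must accompany the fd */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; returns the new fd or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	/* The kernel installed a new descriptor in this process; copy it out. */
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor refers to the same kernel object as the sender's, which
is what lets an unprivileged process use a map that a privileged process created.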

If you think we need to add more, please let me know.





