[PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration

Ferruh Yigit ferruh.yigit at amd.com
Fri Feb 10 21:50:29 CET 2023


On 2/10/2023 3:38 PM, Koikkara Reeny, Shibin wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit at amd.com>
>> Sent: Friday, February 10, 2023 1:04 PM
>> To: Koikkara Reeny, Shibin <shibin.koikkara.reeny at intel.com>;
>> dev at dpdk.org; Zhang, Qi Z <qi.z.zhang at intel.com>; Burakov, Anatoly
>> <anatoly.burakov at intel.com>; Richardson, Bruce
>> <bruce.richardson at intel.com>; Mcnamara, John
>> <john.mcnamara at intel.com>
>> Cc: Loftus, Ciara <ciara.loftus at intel.com>
>> Subject: Re: [PATCH v4] net/af_xdp: AF_XDP PMD CNI Integration
>>
>> On 2/9/2023 12:05 PM, Shibin Koikkara Reeny wrote:
>>> Integrate support for the AF_XDP CNI and device plugin [1] so that the
>>> DPDK AF_XDP PMD can work in an unprivileged container environment.
>>> Part of the AF_XDP PMD initialization process involves loading an eBPF
>>> program onto the given netdev. This operation requires privileges,
>>> which prevents the PMD from being able to work in an unprivileged
>>> container (without root access). The CNI plugin handles the program
>>> loading. The CNI opens a Unix Domain Socket (UDS) and listens for a
>>> client to make requests over that UDS. The client (DPDK) connects and a
>>> "handshake" occurs, then the File Descriptor which points to the
>>> XSKMAP associated with the loaded eBPF program is handed over to the
>>> client. The client can then proceed with creating an AF_XDP socket and
>>> inserting the socket into the XSKMAP pointed to by the FD received on
>>> the UDS.
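
For readers unfamiliar with the mechanism described above: handing a file descriptor across a Unix Domain Socket is normally done with SCM_RIGHTS ancillary data. The sketch below only illustrates that generic technique. The helper names send_fd() and recv_fd() are hypothetical, and the plugin's actual handshake messages are not shown in this thread, so they are not modelled here.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper standing in for the sending (CNI) side:
 * pass 'fd' over the connected Unix domain socket 'sock'.
 * Returns 0 on success, -1 on failure.
 */
static int
send_fd(int sock, int fd)
{
	char data = 'F'; /* one byte of ordinary payload carries the ancillary data */
	struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&ctrl, 0, sizeof(ctrl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper for the receiving (client) side:
 * returns the received descriptor, or -1 on failure.
 */
static int
recv_fd(int sock)
{
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	/* Accept only a single SCM_RIGHTS message holding one descriptor. */
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

The descriptor returned by recv_fd() is a new descriptor in the receiving process that refers to the same kernel object. This is what allows an unprivileged client to use a map that a privileged process created.
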
>>>
>>> A new vdev arg "use_cni" is created to indicate that the user wishes to
>>> run the PMD in unprivileged mode and to receive the XSKMAP FD from the CNI.
>>> When this flag is set, the XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf
>>> flag should be used when creating the socket, which tells libbpf not
>>> to load the default libbpf program on the netdev. We tell libbpf not
>>> to do this because the loading is handled by the CNI in this scenario.
>>>
>>> The patch includes a howto doc explaining how to configure the AF_XDP CNI
>>> to work with DPDK.
>>>
>>> [1]: https://github.com/intel/afxdp-plugins-for-kubernetes
>>>
>>> Signed-off-by: Shibin Koikkara Reeny <shibin.koikkara.reeny at intel.com>
>>
>>
>> Is Anatoly's tested-by tag still valid with this version?
> 
> Yes it is still valid.
> 
>>
>> <...>
>>
>>> @@ -1413,7 +1678,23 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
>>>  		}
>>>  	}
>>>
>>> -	if (rxq->busy_budget) {
>>> +	if (internals->use_cni) {
>>> +		int err, fd, map_fd;
>>> +
>>> +		/* get socket fd from CNI plugin */
>>> +		map_fd = get_cni_fd(internals->if_name);
>>> +		if (map_fd < 0) {
>>> +			AF_XDP_LOG(ERR, "Failed to receive CNI plugin fd\n");
>>> +			goto out_xsk;
>>> +		}
>>> +		/* get socket fd */
>>> +		fd = xsk_socket__fd(rxq->xsk);
>>> +		err = bpf_map_update_elem(map_fd, &rxq->xsk_queue_idx, &fd, 0);
>>> +		if (err) {
>>> +			AF_XDP_LOG(ERR, "Failed to insert unprivileged xsk in map.\n");
>>> +			goto out_xsk;
>>> +		}
>>> +	} else if (rxq->busy_budget) {
>>
>>
>> The 'use_cni' argument is added as an if-else, so setting 'use_cni'
>> automatically makes the 'busy_budget' argument ineffective. Is this
>> intentional? If so, can you please describe why?
>> And can you please document in the driver documentation that the 'use_cni'
>> and 'busy_budget' parameters are mutually exclusive?
>> Maybe this condition can be checked and an error message logged at runtime,
>> not sure.
>>
> 
> When we use the "use_cni" option, in order to configure the busy_budget we need to send a request to the CNI plugin,
> and the CNI plugin will configure busy_poll, because DPDK is running inside a container with limited permissions.
> 
>>
>> Similarly, another parameter check above this (not visible in this patch),
>> xdp_prog (custom_prog_configured), calls the same API
>> (bpf_map_update_elem()). If both parameters are provided, 'use_cni' will
>> overwrite the previous one. Is this intentional?
>> Are the 'use_cni' & 'xdp_prog' parameters mutually exclusive?
> 
> When we use "use_cni" we don't have permission to load the xdp_prog, as our privileges are limited inside the container.
> The CNI plugin handles the loading of the program.


Yes, but what happens if the user provides the 'xdp_prog' parameter?

>>
>>
>> Overall, is the combination of the 'use_cni' parameter with other
>> parameters tested?
> 
> We have tested the communication with the CNI plugin, which loads the program, and the traffic flow.
>  

I got that, but is the combination of 'use_cni' parameter with other
parameters tested?

For example, what happens if the user provides both 'xdp_prog' & 'use_cni'?
There is no documentation for this condition, and there is no check in the
code that could provide a log message to the user.
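
To illustrate the kind of check I mean, here is a minimal sketch. It is not taken from the driver: the struct af_xdp_args type, its fields and check_cni_args() are hypothetical stand-ins for however the PMD stores its parsed devargs, and the real code would report through AF_XDP_LOG() rather than fprintf().

```c
#include <stdio.h>

/* Hypothetical subset of the parsed vdev arguments (illustration only). */
struct af_xdp_args {
	int use_cni;         /* non-zero when 'use_cni' was requested */
	int busy_budget_set; /* non-zero when the user passed 'busy_budget' */
	char prog_path[256]; /* empty string when 'xdp_prog' was not given */
};

/* Reject argument combinations that cannot take effect with 'use_cni'.
 * Returns 0 when the combination is acceptable, -1 otherwise.
 */
static int
check_cni_args(const struct af_xdp_args *args)
{
	if (!args->use_cni)
		return 0;

	if (args->prog_path[0] != '\0') {
		fprintf(stderr,
			"'use_cni' and 'xdp_prog' cannot be used together\n");
		return -1;
	}

	if (args->busy_budget_set) {
		fprintf(stderr,
			"'busy_budget' has no effect when 'use_cni' is set\n");
		return -1;
	}

	return 0;
}
```

Whether the 'busy_budget' case should fail or only warn is a judgment call. The point is that the user gets a message instead of a silently ignored parameter.
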

>>
>>
>>>  		ret = configure_preferred_busy_poll(rxq);
>>>  		if (ret) {
>>>  			AF_XDP_LOG(ERR, "Failed configure busy polling.\n");
>>> @@ -1584,6 +1865,27 @@ static const struct eth_dev_ops ops = {
>>>  	.get_monitor_addr = eth_get_monitor_addr,
>>>  };
>>>
>>> +/* CNI option works in unprivileged container environment
>>> + * and ethernet device functionality will be reduced. So
>>> + * additional customized eth_dev_ops struct is needed
>>> + * for cni. Promiscuous enable and disable functionality
>>> + * is removed.
>>
>>
>> Why can't the promiscuous enable and disable functionality be used with
>> 'use_cni'?
> 
> When we use "use_cni" we are running dpdk_testpmd inside a Docker container, where we have only
> limited permissions. That is the reason I wrote "unprivileged container environment"
> in the comment.
>>
>> Can you please document the limitation in the driver document, also if
>> possible briefly mention reason of the limitation?
> 
> In the documentation as prerequisites we have added :
> +* The Pod should have enabled the capabilities ``CAP_NET_RAW`` and ``CAP_BPF``
> +  for AF_XDP along with support for hugepages.
> 
> In the Background:
> +The standard `AF_XDP PMD`_ initialization process involves loading an eBPF program
> +onto the kernel netdev to be used by the PMD. This operation requires root or
> +escalated Linux privileges and thus prevents the PMD from working in an
> +unprivileged container. The AF_XDP CNI plugin handles this situation by
> +providing a device plugin that performs the program loading.
> 
> If you think we need to add more please let me know.
> 

Hi Shibin,

Thanks for the update.

I think it would be good to update the driver documentation,
'doc/guides/nics/af_xdp.rst', and extend the place where the 'use_cni'
parameter is documented with the following additional information:

- When 'use_cni' parameter is used, 'busy_budget' parameter is not valid
and has no impact
- When 'use_cni' parameter is used, 'xdp_prog' parameter is not valid
and ? (what happens when provided)
- enabling and disabling promiscuous mode is not supported, and describe
briefly why (I know the code has a comment for it but let's put it in
the documentation too).





