[dpdk-dev] af_packet dev default "framesz" of 2048B

Ferruh Yigit ferruh.yigit at intel.com
Fri Nov 16 18:20:13 CET 2018


On 11/15/2018 7:02 PM, Lam, Tiago wrote:
> Hi guys,
> 
> OvS-DPDK has recently had small a change that changed the data room
> available in an mbuf (commit dfaf00e in OvS). This seems to have had the
> consequence of breaking the initialisation of eth_af_packets interfaces,
> when using default values ("options:dpdk-
> devargs=eth_af_packet0,iface=enp61s0f3").
> 
> After investigating, what seems to be happening is that the
> eth_af_packet dev expects an available space of "2048B - TPACKET2_HDRLEN
> + sizeof(struct sockaddr_ll) = 2016B" to be available in the data room
> of each mbuf.  Previous to the above commit, OvS would allocate some
> extra space, and this would mean there would be enough room for the
> checks performed in eth_rx_queue_setup() and eth_dev_mtu_set() in
> rte_eth_af_packet.c. However, with the recent commit that isn't the case
> anymore, and without that extra space the first check in
> eth_rx_queue_setup() will now be hit and setup of a eth_af_packet
> interface fails.
> 
> What I'm trying to understand here is, the logic behind setting a
> default 'framesz' of 2048B and it being hardcoded (instead of being
> based on the underlying MTU of the interface, or the mbuf data room
> directly). The documentation in [1] for mmap() and setting up buffer
> rings mentions the exact same values
> (tp_block_size=4096,tp_frame_size=2048), which seem to have been
> introduced on the first commit, back in 2014. The only constraint
> for the framesize, it seems, its that it fits inside the blocksize (i.e.
> doesn't span multiple blocksizes), and is aligned to TPACKET_ALIGNMENT.

Independent from Bruce's comment to not use smaller mbuf size, I believe
af_packet can be updated to be more flexible.

Related to 'framesize', it has default value 2048 but can be updated via
'framesz=' devarg, so not exactly hardcoded. Your change looks good there,
use devarg value if exist, get from underlying device and set default if it fails.

A few comment below.

> 
> Thus, given those constraints, could we instead base the setting of the
> framesize like on the below patch? It basically tries to set the
> framesize to the MTU of the underlying interface +  TPACKET2_HDRLEN +
> sizeof(structsockaddr_ll), aligned to TPACKET_ALIGNMENT. This would
> allow applications such as OvS to use this interface by default, based
> on a framesize set dynamically, instead of relying on a default of
> 2048B; If one is setting up an eth_af_packet interface with MTU 9000B
> and the underlying MTU is 1500B, for example, this will still fail, and
> that's fine as they can always supply the "framesz" argument
> ("options:dpdk-devargs=eth_af_packet0,iface=enp61s0f3,framesz=9000").
> However, on the default case, or when one has configured manually the
> underlying interface with the expected MTU already, this would work out
> of the box.
> 
> Any thoughts on this?
> 
> Thanks,
> Tiago.
> 
> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
> index 95a98c6..451cec3 100644
> --- a/drivers/net/af_packet/rte_eth_af_packet.c
> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> @@ -24,6 +24,8 @@
>  #include <unistd.h>
>  #include <poll.h>
> 
> +#define RTE_ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

This can go into more generic header

<...>

> @@ -622,6 +617,38 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
>  		return -1;
>  	}
>  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
> +	if (!framesize) {
> +		if (ioctl(sockfd, SIOCGIFMTU, &ifr) == -1) {
> +			RTE_LOG(ERR, PMD,
> +				"%s: ioctl failed (SIOCGIFMTU)",
> +				name);
> +			framesize = DFLT_FRAME_SIZE;
> +		} else {
> +			framesize = ifr.ifr_mtu;
> +			framesize += TPACKET2_HDRLEN + sizeof(struct sockaddr_ll);
> +			framesize = RTE_ROUNDUP(framesize, TPACKET_ALIGNMENT);
> +        }
> +	}
> +
> +	blockcnt = framecnt / (blocksize / framesize);

If returned framesize > blocksize, result will be 0 since both are uint, which
will end framecnt/0 and crash.

<...>

> @@ -887,21 +913,8 @@ rte_eth_from_packet(struct rte_vdev_device *dev,
>  		return -1;
>  	}
> 
> -	blockcount = framecount / (blocksize / framesize);
> -	if (!blockcount) {
> -		PMD_LOG(ERR,
> -			"%s: invalid AF_PACKET MMAP parameters", name);
> -		return -1;
> -	}
> -
> -	PMD_LOG(INFO, "%s: AF_PACKET MMAP parameters:", name);
> -	PMD_LOG(INFO, "%s:\tblock size %d", name, blocksize);
> -	PMD_LOG(INFO, "%s:\tblock count %d", name, blockcount);
> -	PMD_LOG(INFO, "%s:\tframe size %d", name, framesize);
> -	PMD_LOG(INFO, "%s:\tframe count %d", name, framecount);
> -

Why move this block or change rte_pmd_init_internals(), adding SIOCGIFMTU above
this block looks simpler.


More information about the dev mailing list