[dpdk-dev] af_packet dev default "framesz" of 2048B

Lam, Tiago tiago.lam at intel.com
Tue Nov 20 11:28:43 CET 2018


On 16/11/2018 17:20, Ferruh Yigit wrote:
> On 11/15/2018 7:02 PM, Lam, Tiago wrote:
>> Hi guys,
>>
>> OvS-DPDK has recently had small a change that changed the data room
>> available in an mbuf (commit dfaf00e in OvS). This seems to have had the
>> consequence of breaking the initialisation of eth_af_packets interfaces,
>> when using default values ("options:dpdk-
>> devargs=eth_af_packet0,iface=enp61s0f3").
>>
>> After investigating, what seems to be happening is that the
>> eth_af_packet dev expects an available space of "2048B - TPACKET2_HDRLEN
>> + sizeof(struct sockaddr_ll) = 2016B" to be available in the data room
>> of each mbuf.  Previous to the above commit, OvS would allocate some
>> extra space, and this would mean there would be enough room for the
>> checks performed in eth_rx_queue_setup() and eth_dev_mtu_set() in
>> rte_eth_af_packet.c. However, with the recent commit that isn't the case
>> anymore, and without that extra space the first check in
>> eth_rx_queue_setup() will now be hit and setup of a eth_af_packet
>> interface fails.
>>
>> What I'm trying to understand here is, the logic behind setting a
>> default 'framesz' of 2048B and it being hardcoded (instead of being
>> based on the underlying MTU of the interface, or the mbuf data room
>> directly). The documentation in [1] for mmap() and setting up buffer
>> rings mentions the exact same values
>> (tp_block_size=4096,tp_frame_size=2048), which seem to have been
>> introduced on the first commit, back in 2014. The only constraint
>> for the framesize, it seems, its that it fits inside the blocksize (i.e.
>> doesn't span multiple blocksizes), and is aligned to TPACKET_ALIGNMENT.
> 
> Independent from Bruce's comment to not use smaller mbuf size, I believe
> af_packet can be updated to be more flexible.
> 
> Related to 'framesize', it has default value 2048 but can be updated via
> 'framesz=' devarg, so not exactly hardcoded. Your change looks good there,
> use devarg value if exist, get from underlying device and set default if it fails.
> 
> A few comment below.

Thanks for the review, Ferruh. This was only a proposal, but since you
reviewed it, I've addressed the comments and sent a formal patch:
http://patchwork.dpdk.org/patch/48200/

Thanks,
Tiago.

> 
>>
>> Thus, given those constraints, could we instead base the setting of the
>> framesize like on the below patch? It basically tries to set the
>> framesize to the MTU of the underlying interface +  TPACKET2_HDRLEN +
>> sizeof(structsockaddr_ll), aligned to TPACKET_ALIGNMENT. This would
>> allow applications such as OvS to use this interface by default, based
>> on a framesize set dynamically, instead of relying on a default of
>> 2048B; If one is setting up an eth_af_packet interface with MTU 9000B
>> and the underlying MTU is 1500B, for example, this will still fail, and
>> that's fine as they can always supply the "framesz" argument
>> ("options:dpdk-devargs=eth_af_packet0,iface=enp61s0f3,framesz=9000").
>> However, on the default case, or when one has configured manually the
>> underlying interface with the expected MTU already, this would work out
>> of the box.
>>
>> Any thoughts on this?
>>
>> Thanks,
>> Tiago.
>>
>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>> index 95a98c6..451cec3 100644
>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> @@ -24,6 +24,8 @@
>>  #include <unistd.h>
>>  #include <poll.h>
>>
>> +#define RTE_ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
> 
> This can go into more generic header
> 
> <...>
> 
>> @@ -622,6 +617,38 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
>>  		return -1;
>>  	}
>>  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
>> +	if (!framesize) {
>> +		if (ioctl(sockfd, SIOCGIFMTU, &ifr) == -1) {
>> +			RTE_LOG(ERR, PMD,
>> +				"%s: ioctl failed (SIOCGIFMTU)",
>> +				name);
>> +			framesize = DFLT_FRAME_SIZE;
>> +		} else {
>> +			framesize = ifr.ifr_mtu;
>> +			framesize += TPACKET2_HDRLEN + sizeof(struct sockaddr_ll);
>> +			framesize = RTE_ROUNDUP(framesize, TPACKET_ALIGNMENT);
>> +        }
>> +	}
>> +
>> +	blockcnt = framecnt / (blocksize / framesize);
> 
> If returned framesize > blocksize, result will be 0 since both are uint, which
> will end framecnt/0 and crash.
> 
> <...>
> 
>> @@ -887,21 +913,8 @@ rte_eth_from_packet(struct rte_vdev_device *dev,
>>  		return -1;
>>  	}
>>
>> -	blockcount = framecount / (blocksize / framesize);
>> -	if (!blockcount) {
>> -		PMD_LOG(ERR,
>> -			"%s: invalid AF_PACKET MMAP parameters", name);
>> -		return -1;
>> -	}
>> -
>> -	PMD_LOG(INFO, "%s: AF_PACKET MMAP parameters:", name);
>> -	PMD_LOG(INFO, "%s:\tblock size %d", name, blocksize);
>> -	PMD_LOG(INFO, "%s:\tblock count %d", name, blockcount);
>> -	PMD_LOG(INFO, "%s:\tframe size %d", name, framesize);
>> -	PMD_LOG(INFO, "%s:\tframe count %d", name, framecount);
>> -
> 
> Why move this block or change rte_pmd_init_internals(), adding SIOCGIFMTU above
> this block looks simpler.
> 


More information about the dev mailing list