[dpdk-dev] af_packet dev default "framesz" of 2048B
Lam, Tiago
tiago.lam at intel.com
Tue Nov 20 11:28:43 CET 2018
On 16/11/2018 17:20, Ferruh Yigit wrote:
> On 11/15/2018 7:02 PM, Lam, Tiago wrote:
>> Hi guys,
>>
>> OvS-DPDK has recently had small a change that changed the data room
>> available in an mbuf (commit dfaf00e in OvS). This seems to have had the
>> consequence of breaking the initialisation of eth_af_packets interfaces,
>> when using default values ("options:dpdk-
>> devargs=eth_af_packet0,iface=enp61s0f3").
>>
>> After investigating, what seems to be happening is that the
>> eth_af_packet dev expects an available space of "2048B - TPACKET2_HDRLEN
>> + sizeof(struct sockaddr_ll) = 2016B" to be available in the data room
>> of each mbuf. Previous to the above commit, OvS would allocate some
>> extra space, and this would mean there would be enough room for the
>> checks performed in eth_rx_queue_setup() and eth_dev_mtu_set() in
>> rte_eth_af_packet.c. However, with the recent commit that isn't the case
>> anymore, and without that extra space the first check in
>> eth_rx_queue_setup() will now be hit and setup of a eth_af_packet
>> interface fails.
>>
>> What I'm trying to understand here is, the logic behind setting a
>> default 'framesz' of 2048B and it being hardcoded (instead of being
>> based on the underlying MTU of the interface, or the mbuf data room
>> directly). The documentation in [1] for mmap() and setting up buffer
>> rings mentions the exact same values
>> (tp_block_size=4096,tp_frame_size=2048), which seem to have been
>> introduced on the first commit, back in 2014. The only constraint
>> for the framesize, it seems, its that it fits inside the blocksize (i.e.
>> doesn't span multiple blocksizes), and is aligned to TPACKET_ALIGNMENT.
>
> Independent from Bruce's comment to not use smaller mbuf size, I believe
> af_packet can be updated to be more flexible.
>
> Related to 'framesize', it has default value 2048 but can be updated via
> 'framesz=' devarg, so not exactly hardcoded. Your change looks good there,
> use devarg value if exist, get from underlying device and set default if it fails.
>
> A few comment below.
Thanks for the review, Ferruh. This was only a proposal, but since you
reviewed it, I've addressed the comments and sent a formal patch:
http://patchwork.dpdk.org/patch/48200/
Thanks,
Tiago.
>
>>
>> Thus, given those constraints, could we instead base the setting of the
>> framesize like on the below patch? It basically tries to set the
>> framesize to the MTU of the underlying interface + TPACKET2_HDRLEN +
>> sizeof(structsockaddr_ll), aligned to TPACKET_ALIGNMENT. This would
>> allow applications such as OvS to use this interface by default, based
>> on a framesize set dynamically, instead of relying on a default of
>> 2048B; If one is setting up an eth_af_packet interface with MTU 9000B
>> and the underlying MTU is 1500B, for example, this will still fail, and
>> that's fine as they can always supply the "framesz" argument
>> ("options:dpdk-devargs=eth_af_packet0,iface=enp61s0f3,framesz=9000").
>> However, on the default case, or when one has configured manually the
>> underlying interface with the expected MTU already, this would work out
>> of the box.
>>
>> Any thoughts on this?
>>
>> Thanks,
>> Tiago.
>>
>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>> index 95a98c6..451cec3 100644
>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> @@ -24,6 +24,8 @@
>> #include <unistd.h>
>> #include <poll.h>
>>
>> +#define RTE_ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
>
> This can go into more generic header
>
> <...>
>
>> @@ -622,6 +617,38 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
>> return -1;
>> }
>> memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
>> + if (!framesize) {
>> + if (ioctl(sockfd, SIOCGIFMTU, &ifr) == -1) {
>> + RTE_LOG(ERR, PMD,
>> + "%s: ioctl failed (SIOCGIFMTU)",
>> + name);
>> + framesize = DFLT_FRAME_SIZE;
>> + } else {
>> + framesize = ifr.ifr_mtu;
>> + framesize += TPACKET2_HDRLEN + sizeof(struct sockaddr_ll);
>> + framesize = RTE_ROUNDUP(framesize, TPACKET_ALIGNMENT);
>> + }
>> + }
>> +
>> + blockcnt = framecnt / (blocksize / framesize);
>
> If returned framesize > blocksize, result will be 0 since both are uint, which
> will end framecnt/0 and crash.
>
> <...>
>
>> @@ -887,21 +913,8 @@ rte_eth_from_packet(struct rte_vdev_device *dev,
>> return -1;
>> }
>>
>> - blockcount = framecount / (blocksize / framesize);
>> - if (!blockcount) {
>> - PMD_LOG(ERR,
>> - "%s: invalid AF_PACKET MMAP parameters", name);
>> - return -1;
>> - }
>> -
>> - PMD_LOG(INFO, "%s: AF_PACKET MMAP parameters:", name);
>> - PMD_LOG(INFO, "%s:\tblock size %d", name, blocksize);
>> - PMD_LOG(INFO, "%s:\tblock count %d", name, blockcount);
>> - PMD_LOG(INFO, "%s:\tframe size %d", name, framesize);
>> - PMD_LOG(INFO, "%s:\tframe count %d", name, framecount);
>> -
>
> Why move this block or change rte_pmd_init_internals(), adding SIOCGIFMTU above
> this block looks simpler.
>
More information about the dev
mailing list