[dpdk-dev] [PATCH v2] net/kni: calc mbuf&mtu according to given mb_pool

Ferruh Yigit ferruh.yigit at intel.com
Wed Mar 20 20:48:46 CET 2019


On 3/17/2019 9:43 AM, Liron Himi wrote:
> 
> 
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit at intel.com> 
> Sent: Friday, March 15, 2019 19:59
> To: Liron Himi <lironh at marvell.com>
> Cc: dev at dpdk.org; Alan Winkowski <walan at marvell.com>
> Subject: Re: [PATCH v2] net/kni: calc mbuf&mtu according to given mb_pool
> 
> On 3/15/2019 5:02 PM, Liron Himi wrote:
>>
>>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit at intel.com>
>> Sent: Thursday, March 14, 2019 11:28
>> To: Liron Himi <lironh at marvell.com>
>> Cc: dev at dpdk.org; Alan Winkowski <walan at marvell.com>
>> Subject: Re: [PATCH v2] net/kni: calc mbuf&mtu according to given 
>> mb_pool
>>
>> On 3/14/2019 6:37 AM, Liron Himi wrote:
>>>
>>>
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yigit at intel.com>
>>> Sent: Wednesday, March 13, 2019 18:58
>>> To: Liron Himi <lironh at marvell.com>
>>> Cc: dev at dpdk.org; Alan Winkowski <walan at marvell.com>
>>> Subject: Re: [PATCH v2] net/kni: calc mbuf&mtu according to given 
>>> mb_pool
>>>
>>> On 3/10/2019 2:27 PM, Liron Himi wrote:
>>>> Adding Alan.
>>>>
>>>> -----Original Message-----
>>>> From: Liron Himi
>>>> Sent: Monday, February 25, 2019 13:30
>>>> To: ferruh.yigit at intel.com
>>>> Cc: dev at dpdk.org; Liron Himi <lironh at marvell.com>; Liron Himi 
>>>> <lironh at marvell.com>
>>>> Subject: RE: [PATCH v2] net/kni: calc mbuf&mtu according to given 
>>>> mb_pool
>>>>
>>>> Hi,
>>>>
>>>> Kind reminder
>>>
>>> Sorry for late response.
>>>
>>>>
>>>> Regards,
>>>> Liron
>>>>
>>>> -----Original Message-----
>>>> From: lironh at marvell.com <lironh at marvell.com>
>>>> Sent: Saturday, February 23, 2019 22:15
>>>> To: ferruh.yigit at intel.com
>>>> Cc: dev at dpdk.org; Liron Himi <lironh at marvell.com>
>>>> Subject: [PATCH v2] net/kni: calc mbuf&mtu according to given 
>>>> mb_pool
>>>>
>>>> From: Liron Himi <lironh at marvell.com>
>>>>
>>>> - mbuf_size and mtu are now being calculated according to the given mb-pool.
>>>
>>> +1 to have dynamic size instead of fixed "MAX_PACKET_SZ"
>>>
>>>>
>>>> - max_mtu is now being set according to the given mtu
>>>>
>>>> the above two changes provide the ability to work with jumbo frames
>>>
>>> From kernel -> userspace, if the data length is bigger than
>>> mbuf->buffer_len (- headroom) the packet is dropped. I guess you are trying to solve that issue?
>>> [L.H.] correct
>>>
>>> By providing larger mbuf buffer, it should be possible to send larger (jumbo) packets?
>>> [L.H.] correct
>>>
>>> Another option is to add multi-segment send support, which also allows sending large packets from kernel to userspace and can co-exist with your patch.
>>> What do you think, can you work on that support?
>>> [L.H.] I suggest to first go with this patch, and then prepare 
>>> multi-segment patch if possible
>>
>> Yes, I was hoping both can go in a same patchset, can it be possible?
>> [L.H.] I'm on tight schedule right now, I prefer to continue with  this patch as is, multi-segment support can be pushed later on.
> 
> OK
> 
>>
>>> Multi-segment support already exists in the userspace-to-kernel path, but the other way around is missing.
>>>
>>>>
>>>> Signed-off-by: Liron Himi <lironh at marvell.com>
>>>> ---
>>>>  drivers/net/kni/rte_eth_kni.c | 10 +++++++---
>>>>  kernel/linux/kni/compat.h     |  4 ++++
>>>>  kernel/linux/kni/kni_misc.c   |  3 +++
>>>
>>> It can be good to update release notes / kni documentation to document new feature.
>>> [L.H.] okay
> [L.H.] I have made the following change, but I'm not sure in which document to note the addition of this new feature.
> Is it the release notes? If so, which one exactly?

I think the feature is big enough to add to the release notes; the one for the
current release is: "doc/guides/rel_notes/release_19_05.rst".

> 
> Should I mark it as Jumbo support? Or just specify that the mtu and mbuf are based on the given pool?

I was thinking of "doc/guides/prog_guide/kernel_nic_interface.rst" instead of the
kni PMD doc.

In that doc, in the "KNI Creation and Deletion" section, there is a paragraph about
"struct rte_kni_conf". I think appending a single sentence to that paragraph,
saying that by default the mtu is set based on the mbuf buffer length, would be good.
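
As a rough illustration of the sizing discussed in this thread, the sketch below
reproduces the arithmetic from the patch without depending on DPDK. The helper
names and the constant values (128-byte headroom, 14-byte Ethernet header) are
assumptions made for the example; in the real code the values come from
RTE_PKTMBUF_HEADROOM, ETHER_HDR_LEN and rte_pktmbuf_data_room_size() for the
pool in use.

```c
#include <stdint.h>

/* Assumed values for illustration only; in DPDK these come from
 * RTE_PKTMBUF_HEADROOM (build configuration) and ETHER_HDR_LEN. */
#define EX_PKTMBUF_HEADROOM 128u
#define EX_ETHER_HDR_LEN    14u

/* Hypothetical helper: usable data bytes per mbuf, mirroring
 * rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM.
 * Returns 0 if the data room does not exceed the headroom. */
static inline uint32_t
ex_kni_mbuf_size(uint32_t data_room_size)
{
	if (data_room_size <= EX_PKTMBUF_HEADROOM)
		return 0;
	return data_room_size - EX_PKTMBUF_HEADROOM;
}

/* Hypothetical helper: MTU derived from the mbuf size, mirroring
 * KNI_ETHER_MTU(mbuf_size) in the patch.
 * Returns 0 if the mbuf cannot hold more than an Ethernet header. */
static inline uint32_t
ex_kni_mtu(uint32_t mbuf_size)
{
	if (mbuf_size <= EX_ETHER_HDR_LEN)
		return 0;
	return mbuf_size - EX_ETHER_HDR_LEN;
}
```

Under these assumed constants, a pool with a 2176-byte data room gives an
mbuf size of 2048 and an MTU of 2034, while a 9344-byte data room gives an MTU
of 9202, which is what makes jumbo frames possible without multi-segment support.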

> 
> 
> diff --git a/doc/guides/nics/kni.rst b/doc/guides/nics/kni.rst
> index 204fbd5..a66c595 100644
> --- a/doc/guides/nics/kni.rst
> +++ b/doc/guides/nics/kni.rst
> @@ -55,7 +55,8 @@ configuration:
>  
>          Interface name: kni#
>          force bind kernel thread to a core : NO
> -        mbuf size: MAX_PACKET_SZ
> +        mbuf size: (rte_pktmbuf_data_room_size(pktmbuf_pool) - RTE_PKTMBUF_HEADROOM)
> +        mtu: (conf.mbuf_size - ETHER_HDR_LEN)
>>>
>>>>  3 files changed, 14 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
>>>> index a1e9970..5e02224 100644
>>>> --- a/drivers/net/kni/rte_eth_kni.c
>>>> +++ b/drivers/net/kni/rte_eth_kni.c
>>>> @@ -16,9 +16,11 @@
>>>>  /* Only single queue supported */
>>>>  #define KNI_MAX_QUEUE_PER_PORT 1
>>>>  
>>>> -#define MAX_PACKET_SZ 2048
>>>>  #define MAX_KNI_PORTS 8
>>>>  
>>>> +#define KNI_ETHER_MTU(mbuf_size)       \
>>>> +	((mbuf_size) - ETHER_HDR_LEN) /**< Ethernet MTU. */
>>>> +
>>>>  #define ETH_KNI_NO_REQUEST_THREAD_ARG	"no_request_thread"
>>>>  static const char * const valid_arguments[] = {
>>>>  	ETH_KNI_NO_REQUEST_THREAD_ARG,
>>>> @@ -123,11 +125,13 @@ eth_kni_start(struct rte_eth_dev *dev)
>>>>  	struct rte_kni_conf conf;
>>>>  	const char *name = dev->device->name + 4; /* remove net_ */
>>>>  
>>>> +	mb_pool = internals->rx_queues[0].mb_pool;
>>>>  	snprintf(conf.name, RTE_KNI_NAMESIZE, "%s", name);
>>>>  	conf.force_bind = 0;
>>>>  	conf.group_id = port_id;
>>>> -	conf.mbuf_size = MAX_PACKET_SZ;
>>>> -	mb_pool = internals->rx_queues[0].mb_pool;
>>>> +	conf.mbuf_size =
>>>> +		rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM;
>>>> +	conf.mtu = KNI_ETHER_MTU(conf.mbuf_size);
>>>
>>> Can you please do "conf.mbuf_size" changes also to kni sample application?
>>> kni sample application gets mtu from physical device, so I believe better to not change that but I think mbuf_size can be dynamic instead of hardcoded.
>>> [L.H.] okay
>>>
>>> Another question, for the case mbuf size < ETHER_MTU, should we keep MTU ETHER_MTU, what do you think?
>>> [L.H.] in any case we need to set the MTU according to the mbuf-size until multi-segment support will be available, right?
>>
>> Right.
>>
>>>
>>>>  
>>>>  	internals->kni = rte_kni_alloc(mb_pool, &conf, NULL);
>>>>  	if (internals->kni == NULL) {
>>>> diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
>>>> index 3c575c7..b9f9a6f 100644
>>>> --- a/kernel/linux/kni/compat.h
>>>> +++ b/kernel/linux/kni/compat.h
>>>> @@ -117,3 +117,7 @@
>>>>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 11, 0)
>>>>  #define HAVE_SIGNAL_FUNCTIONS_OWN_HEADER
>>>>  #endif
>>>> +
>>>> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 10, 0)
>>>> +#define HAVE_MAX_MTU_PARAM
>>>> +#endif
>>>> diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
>>>> index 522ae23..04c78eb 100644
>>>> --- a/kernel/linux/kni/kni_misc.c
>>>> +++ b/kernel/linux/kni/kni_misc.c
>>>> @@ -459,6 +459,9 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
>>>>  
>>>>  	if (dev_info.mtu)
>>>>  		net_dev->mtu = dev_info.mtu;
>>>> +#ifdef HAVE_MAX_MTU_PARAM
>>>> +	net_dev->max_mtu = net_dev->mtu;
>>>> +#endif
>>>
>>> Do we need to set 'max_mtu'? I guess this is not really required for large packet support; if so, what do you think about making this a separate patch?
>>> [L.H.] 'max_mtu' is set by default to '1500', so in order to be able to modify the interface MTU to support jumbo (or even any size > 1500) the 'max_mtu' must be updated to the larger supported value.
>>
>> I missed that it is set to '1500' by default, I was thinking it is zero by default.
>> Can you please point to where its default value is set in Linux?
>> [L.H.] I also thought that a zero value would make more sense to provide backwards compatibility, but this is not the case.
>> Here is the code snippet from net/ethernet/eth.c :
>> void ether_setup(struct net_device *dev) {
>> 	dev->header_ops		= &eth_header_ops;
>> 	dev->type		= ARPHRD_ETHER;
>> 	dev->hard_header_len 	= ETH_HLEN;
>> 	dev->min_header_len	= ETH_HLEN;
>> 	dev->mtu		= ETH_DATA_LEN;
>> 	dev->min_mtu		= ETH_MIN_MTU;
>> 	dev->max_mtu		= ETH_DATA_LEN;
>>
> 
> You are right, thanks for the pointer, please go with this update.
> 
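
To illustrate why 'max_mtu' matters here: since Linux 4.10 the core MTU-change
path rejects a new MTU outside the [min_mtu, max_mtu] range, treating a max_mtu
of 0 as "no upper limit". The sketch below is a simplified userspace model of
that range check, not the kernel code itself; the struct and function names are
hypothetical.

```c
#include <errno.h>

/* Hypothetical stand-in for the few struct net_device fields involved. */
struct ex_netdev {
	unsigned int mtu;
	unsigned int min_mtu;
	unsigned int max_mtu; /* 0 is treated as "no upper limit" */
};

/* Simplified model of the MTU range check: returns 0 on success,
 * -EINVAL if new_mtu is outside the allowed range. */
static int
ex_set_mtu(struct ex_netdev *dev, int new_mtu)
{
	if (new_mtu < 0 || (unsigned int)new_mtu < dev->min_mtu)
		return -EINVAL;
	if (dev->max_mtu > 0 && (unsigned int)new_mtu > dev->max_mtu)
		return -EINVAL;
	dev->mtu = (unsigned int)new_mtu;
	return 0;
}
```

With the ether_setup() defaults quoted above (mtu 1500, max_mtu 1500), a request
for MTU 9000 fails in this model until max_mtu is raised, which is the reason
the patch updates 'max_mtu' alongside 'mtu'.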


