[EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability
    Ferruh Yigit 
    ferruh.yigit at xilinx.com
       
    Tue Sep 13 11:28:42 CEST 2022
    
    
  
On 9/7/2022 10:31 PM, Hanumanth Reddy Pothula wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit at xilinx.com>
>> Sent: Wednesday, September 7, 2022 4:54 PM
>> To: Hanumanth Reddy Pothula <hpothula at marvell.com>; Ding, Xuan
>> <xuan.ding at intel.com>; Thomas Monjalon <thomas at monjalon.net>; Andrew
>> Rybchenko <andrew.rybchenko at oktetlabs.ru>
>> Cc: dev at dpdk.org; Wu, WenxuanX <wenxuanx.wu at intel.com>; Li, Xiaoyun
>> <xiaoyun.li at intel.com>; stephen at networkplumber.org; Wang, YuanX
>> <yuanx.wang at intel.com>; mdr at ashroe.eu; Zhang, Yuying
>> <yuying.zhang at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>;
>> viacheslavo at nvidia.com; Jerin Jacob Kollanukkaran <jerinj at marvell.com>;
>> Nithin Kumar Dabilpuram <ndabilpuram at marvell.com>
>> Subject: Re: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability
>>
>> On 9/7/2022 8:02 AM, Hanumanth Reddy Pothula wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ferruh Yigit <ferruh.yigit at xilinx.com>
>>>> Sent: Tuesday, September 6, 2022 5:48 PM
>>>> To: Hanumanth Reddy Pothula <hpothula at marvell.com>; Ding, Xuan
>>>> <xuan.ding at intel.com>; Thomas Monjalon <thomas at monjalon.net>;
>> Andrew
>>>> Rybchenko <andrew.rybchenko at oktetlabs.ru>
>>>> Cc: dev at dpdk.org; Wu, WenxuanX <wenxuanx.wu at intel.com>; Li, Xiaoyun
>>>> <xiaoyun.li at intel.com>; stephen at networkplumber.org; Wang, YuanX
>>>> <yuanx.wang at intel.com>; mdr at ashroe.eu; Zhang, Yuying
>>>> <yuying.zhang at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>;
>>>> viacheslavo at nvidia.com; Jerin Jacob Kollanukkaran
>>>> <jerinj at marvell.com>; Nithin Kumar Dabilpuram
>>>> <ndabilpuram at marvell.com>
>>>> Subject: Re: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort
>>>> capability
>>>>
>>>> On 8/30/2022 1:08 PM, Hanumanth Reddy Pothula wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ferruh Yigit <ferruh.yigit at xilinx.com>
>>>>>> Sent: Wednesday, August 24, 2022 9:04 PM
>>>>>> To: Ding, Xuan <xuan.ding at intel.com>; Hanumanth Reddy Pothula
>>>>>> <hpothula at marvell.com>; Thomas Monjalon <thomas at monjalon.net>;
>>>> Andrew
>>>>>> Rybchenko <andrew.rybchenko at oktetlabs.ru>
>>>>>> Cc: dev at dpdk.org; Wu, WenxuanX <wenxuanx.wu at intel.com>; Li,
>> Xiaoyun
>>>>>> <xiaoyun.li at intel.com>; stephen at networkplumber.org; Wang, YuanX
>>>>>> <yuanx.wang at intel.com>; mdr at ashroe.eu; Zhang, Yuying
>>>>>> <yuying.zhang at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>;
>>>>>> viacheslavo at nvidia.com; Jerin Jacob Kollanukkaran
>>>>>> <jerinj at marvell.com>; Nithin Kumar Dabilpuram
>>>>>> <ndabilpuram at marvell.com>
>>>>>> Subject: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort
>>>>>> capability
>>>>>>
>>>>>> External Email
>>>>>>
>>>>>> -------------------------------------------------------------------
>>>>>> --
>>>>>> -
>>>>>
>>>>>
>>>>> Thanks Ding Xuan and Ferruh Yigit for reviewing the changes and for
>>>>> providing
>>>> your valuable feedback.
>>>>> Please find responses inline.
>>>>>
>>>>>> On 8/23/2022 4:26 AM, Ding, Xuan wrote:
>>>>>>> Hi Hanumanth,
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Hanumanth Pothula <hpothula at marvell.com>
>>>>>>>> Sent: Saturday, August 13, 2022 1:25 AM
>>>>>>>> To: Thomas Monjalon <thomas at monjalon.net>; Ferruh Yigit
>>>>>>>> <ferruh.yigit at xilinx.com>; Andrew Rybchenko
>>>>>>>> <andrew.rybchenko at oktetlabs.ru>
>>>>>>>> Cc: dev at dpdk.org; Ding, Xuan <xuan.ding at intel.com>; Wu, WenxuanX
>>>>>>>> <wenxuanx.wu at intel.com>; Li, Xiaoyun <xiaoyun.li at intel.com>;
>>>>>>>> stephen at networkplumber.org; Wang, YuanX
>> <yuanx.wang at intel.com>;
>>>>>>>> mdr at ashroe.eu; Zhang, Yuying <yuying.zhang at intel.com>; Zhang, Qi
>>>>>>>> Z <qi.z.zhang at intel.com>; viacheslavo at nvidia.com;
>>>>>>>> jerinj at marvell.com; ndabilpuram at marvell.com; Hanumanth Pothula
>>>>>>>> <hpothula at marvell.com>
>>>>>>>> Subject: [PATCH v2 1/3] ethdev: introduce pool sort capability
>>>>>>>>
>>>>>>>> Presently, the 'Buffer Split' feature supports sending multiple
>>>>>>>> segments of the received packet to PMD, which programs the HW to
>>>>>>>> receive the packet in segments from different pools.
>>>>>>>>
>>>>>>>> This patch extends the feature to support the pool sort capability.
>>>>>>>> Some of the HW has support for choosing memory pools based on the
>>>>>>>> packet's size. The pool sort capability allows PMD to choose a
>>>>>>>> memory pool based on the packet's length.
>>>>>>>>
>>>>>>>> This is often useful for saving the memory where the application
>>>>>>>> can create a different pool to steer the specific size of the
>>>>>>>> packet, thus enabling effective use of memory.
>>>>>>>>
>>>>>>>> For example, let's say HW has a capability of three pools,
>>>>>>>>      - pool-1 size is 2K
>>>>>>>>      - pool-2 size is > 2K and < 4K
>>>>>>>>      - pool-3 size is > 4K
>>>>>>>> Here,
>>>>>>>>             pool-1 can accommodate packets with sizes < 2K
>>>>>>>>             pool-2 can accommodate packets with sizes > 2K and < 4K
>>>>>>>>             pool-3 can accommodate packets with sizes > 4K
>>>>>>>>
>>>>>>>> With pool sort capability enabled in SW, an application may
>>>>>>>> create three pools of different sizes and send them to PMD.
>>>>>>>> Allowing PMD to program HW based on packet lengths. So that
>>>>>>>> packets with less than 2K are received on pool-1, packets with
>>>>>>>> lengths between 2K and 4K are received on pool-2 and finally
>>>>>>>> packets greater than 4K are received on pool-
>>>>>> 3.
>>>>>>>>
>>>>>>>> The following two capabilities are added to the
>>>>>>>> rte_eth_rxseg_capa structure, 1. pool_sort --> tells pool sort capability
>> is supported by HW.
>>>>>>>> 2. max_npool --> max number of pools supported by HW.
>>>>>>>>
>>>>>>>> Defined new structure rte_eth_rxseg_sort, to be used only when
>>>>>>>> pool sort capability is present. If required this may be extended
>>>>>>>> further to support more configurations.
>>>>>>>>
>>>>>>>> Signed-off-by: Hanumanth Pothula <hpothula at marvell.com>
>>>>>>>>
>>>>>>>> v2:
>>>>>>>>      - Along with spec changes, uploading testpmd and driver changes.
>>>>>>>
>>>>>>> Thanks for CCing. It's an interesting feature.
>>>>>>>
>>>>>>> But I have one question here:
>>>>>>> Buffer split is for split receiving packets into multiple
>>>>>>> segments, while pool sort supports PMD to put the receiving
>>>>>>> packets into different pools
>>>>>> according to packet size.
>>>>>>> Every packet is still intact.
>>>>>>>
>>>>>>> So, at this level, pool sort does not belong to buffer split.
>>>>>>> And you already use a different function to check pool sort rather
>>>>>>> than check
>>>>>> buffer split.
>>>>>>>
>>>>>>> Should a new RX offload be introduced? like
>>>>>> "RTE_ETH_RX_OFFLOAD_POOL_SORT".
>>>>>>>
>>>>> Please find my response below.
>>>>>>
>>>>>> Hi Hanumanth,
>>>>>>
>>>>>> I had the similar concern with the feature. I assume you want to
>>>>>> benefit from exiting config structure that gets multiple mempool as
>>>>>> argument, since this feature also needs multiple mempools, but the
>>>>>> feature is
>>>> different.
>>>>>>
>>>>>> It looks to me wrong to check 'OFFLOAD_BUFFER_SPLIT' offload to
>>>>>> decide if to receive into multiple mempool or not, which doesn't
>>>>>> have
>>>> anything related split.
>>>>>> Also not sure about using the 'sort' keyword.
>>>>>> What do you think to introduce new fetaure, instead of extending
>>>>>> existing split one?
>>>>>
>>>>> Actually we thought both BUFFER_SPLIT and POOL_SORT are similar
>>>>> features where RX pools are configured in certain way and thought
>>>>> not use up one more RX offload capability, as the existing software
>>>>> architecture
>>>> can be extended to support pool_sort capability.
>>>>> Yes, as part of pool sort, there is no buffer split but pools are
>>>>> picked based on
>>>> the buffer length.
>>>>>
>>>>> Since you think it's better to use new RX offload for POOL_SORT,
>>>>> will go ahead
>>>> and implement the same.
>>>>>
>>>>>> This is optimisation, right? To enable us to use less memory for
>>>>>> the packet buffer, does it qualify to a device offload?
>>>>>>
>>>>> Yes, its qualify as a device offload and saves memory.
>>>>> Marvel NIC has a capability to receive packets on  two different
>>>>> pools based
>>>> on its length.
>>>>> Below explained more on the same.
>>>>>>
>>>>>> Also, what is the relation with segmented Rx, how a PMD decide to
>>>>>> use segmented Rx or bigger mempool? How can application can configure
>> this?
>>>>>>
>>>>>> Need to clarify the rules, based on your sample, if a 512 bytes
>>>>>> packet received, does it have to go pool-1, or can it go to any of three
>> pools?
>>>>>>
>>>>> Here, Marvell NIC supports two HW pools, SPB(small packet buffer)
>>>>> pool and
>>>> LPB(large packet buffer) pool.
>>>>> SPB pool can hold up to 4KB
>>>>> LPB pool can hold anything more than 4KB Smaller packets are
>>>>> received on SPB pool and larger packets on LPB pool, based on the RQ
>> configuration.
>>>>> Here, in our case HW pools holds whole packet. So if a packet is
>>>>> divided into segments, lower layer HW going to receive all segments
>>>>> of the packet and then going to place the whole packet in SPB/LPB
>>>>> pool, based
>>>> on the packet length.
>>>>>
>>>>
>>>> If the packet is bigger than 4KB, you have two options,
>>>> 1- Use multiple chained buffers in SPB
>>>> 2- Use single LPB buffer
>>>>
>>>> As I understand (2) is used in this case, but I think we should
>>>> clarify how this feature works with 'RTE_ETH_RX_OFFLOAD_SCATTER'
>>>> offload, if it is requested by user.
>>>>
>>>> Or lets say HW has two pools with 1K and 2K sizes, what is expected
>>>> with 4K packet, with or without scattered Rx offload?
>>>>
>>>
>>> As mentioned, Marvell supports two pools, pool-1(SPB) and pool-2(LPB)
>>> If the packet length is within pool-1 length and has only one segment then the
>> packet is allocated from pool-1.
>>> If the packet length is greater than pool-1 or has more than one segment then
>> the packet is allocated from pool-2.
>>>
>>> So, here packets with a single segment and length less than 1K are
>>> allocated from pool-1 and packets with multiple segments or packets with
>> length greater than 1K are allocated from pool-2.
>>>
>>
>> To have multiple segment or not is HW configuration, it is not external variable.
>> Drivers mostly decide to configure HW to receive multiple segment or not based
>> on buffer size and max packet size device support.
>> In this case since buffer size is not fixed, there are multiple buffer sizes, how
>> driver will configure HW?
>>
>> This is not specific to Marvell HW, for the case multiple mempool supported, it is
>> better to clarify in this patch how it is works with
>> 'RTE_ETH_RX_OFFLOAD_SCATTER' offload.
>>
> 
> Here, application sends multiple pools with different buffer lengths to PMD and PMD further programs HW depending on its architecture.
> Similarly, in this case, where multiple HW pools are present,  with 'RTE_ETH_RX_OFFLOAD_SCATTER' enabled, PMD receives packets/segments based on the HW architecture.
> Depending the architecture, if any extra programming is required(either to implement some logic or HW programming) in RX path, its to be implemented in that NIC PMD.
My intention is to clarify the relation between features, like Andrew 
suggested to have Rx segmentation on any pools including mixture of 
pools, that is an option, only please document this in the API 
documentation so that it is clear for future PMD/app developers.
> As far as multiple pool support is considered, in Marvell case, once HW pools are programmed in control path, there is nothing to be done on fast path.
> 
Out of curiosity, how this is transparent in fast path. Normally mbuf 
will point to a data buffer, if there are multiple pools, how this is 
not effected?
Is there any redirection, like mbuf buffer points to some kind of 
metedata (event etc..)?
> So, I think, It depends on HW architecture of how multiple pool and multiple segment receive is implemented.
> 
> Please suggest, if I am missing any generic scenario(use case) here.
> 
>>>>> As pools are picked based on the packets length we used SORT term.
>>>>> In case
>>>> you have any better term(word), please suggest.
>>>>>
>>>>
>>>> what about multiple pool, like RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL, I
>>>> think it is more clear but I would like to get more comments from
>>>> others, naming is hard ;)
>>>>
>>> Yes, RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL is clearer than
>> RTE_ETH_RX_OFFLOAD_SORT_POOL.
>>> Thanks for the suggestion.
>>> Will upload V4 with RTE_ETH_RX_OFFLOAD_MULTIPLE_POOL.
>>>
>>>>>>
>>>>>> And I don't see any change in the 'net/cnxk' Rx burst code, when
>>>>>> multiple mempool used, while filling the mbufs shouldn't it check
>>>>>> which mempool is filled. How this works without update in the Rx
>>>>>> burst code, or am I missing some implementation detail?
>>>>>>
>>>>> Please find PMD changes in patch [v2,3/3] net/cnxk: introduce pool
>>>>> sort capability Here, in control path, HW pools are programmed based
>>>>> on the
>>>> inputs it received from the application.
>>>>> Once the HW is programmed, packets are received on HW pools based
>>>>> the
>>>> packets sizes.
>>>>
>>>> I was expecting to changes in datapath too, something like in Rx
>>>> burst function check if spb or lpb is used and update mbuf pointers
>> accordingly.
>>>> But it seems HW doesn't work this way, can you please explain how
>>>> this feature works transparent to datapath code?
>>>>
>>>>>>
>>>>>
>>>>> I will upload V3 where POOL_SORT is implemented as new RX OFFLOAD,
>>>>> unless
>>>> If you have any other suggestion/thoughts.
>>>>>
>>>>
>>>> <...>
>>>
> 
    
    
More information about the dev
mailing list