[PATCH 1/2] net/txgbe: add vectorized functions for Rx/Tx
    Ferruh Yigit 
    ferruh.yigit at amd.com
       
    Thu Mar 21 17:21:57 CET 2024
    
    
  
On 3/5/2024 8:10 AM, Jiawen Wu wrote:
> On Wed, Feb 7, 2024 11:13 AM, Ferruh.Yigit at amd.com wrote:
>> On 2/1/2024 3:00 AM, Jiawen Wu wrote:
>>> To optimize Rx/Tx burst process, add SSE/NEON vector instructions on
>>> x86/arm architecture.
>>>
>>
>> Do you have any performance improvement number with vector
>> implementation, if so can you put it into commit log for record?
> 
> On our local x86 platforms, the performance was at full speed without
> using vector. So we don't have the performance improvement number
> with SSE yet. But I will add the test result for arm.
> 
Ack

>>> @@ -2198,8 +2220,15 @@ txgbe_set_tx_function(struct rte_eth_dev *dev, struct txgbe_tx_queue *txq)
>>>  #endif
>>>  			txq->tx_free_thresh >= RTE_PMD_TXGBE_TX_MAX_BURST) {
>>>  		PMD_INIT_LOG(DEBUG, "Using simple tx code path");
>>> -		dev->tx_pkt_burst = txgbe_xmit_pkts_simple;
>>>  		dev->tx_pkt_prepare = NULL;
>>> +		if (txq->tx_free_thresh <= RTE_TXGBE_TX_MAX_FREE_BUF_SZ &&
>>> +				(rte_eal_process_type() != RTE_PROC_PRIMARY ||
>>>
>>
>> Why vector Tx enable only for secondary process?
> 
> It is not only for secondary process. The constraint is
> 
> (rte_eal_process_type() != RTE_PROC_PRIMARY || txgbe_txq_vec_setup(txq) == 0)
> 
> This code references ixgbe, which explains:
> "When using multiple processes, the TX function used in all processes
>  should be the same, otherwise the secondary processes cannot transmit
>  more than tx-ring-size - 1 packets.
>  To achieve this, we extract out the code to select the ixgbe TX function
>  to be used into a separate function inside the ixgbe driver, and call
>  that from a secondary process when it is attaching to an
>  already-configured NIC."
> 
Got it.

1. Does txgbe have a constraint that the same Tx function must be used
across separate queues?
Tx functions are all in SW, right? The HW interface is the same, so HW
doesn't know or care whether vector Tx or simple Tx is used.
As primary and secondary processes manage different queues, I don't see
why this constraint exists.

2. I see the above logic prevents a secondary from calling
'txgbe_txq_vec_setup()' again. Perhaps unlikely, but technically, if
'txgbe_txq_vec_setup()' fails for the primary, 'txgbe_xmit_pkts_simple'
is set for the primary and 'txgbe_xmit_pkts_vec' for the secondary, so
the two processes end up with different Tx functions. Can you please
check whether this scenario is valid?
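
To make point 2 concrete, here is a minimal standalone sketch of the
quoted condition, not driver code. Everything in it is a hypothetical
stand-in: 'struct proc_model', 'model_vec_setup()', 'select_tx_path()'
and 'paths_mismatch()' do not exist in txgbe, and the
'tx_free_thresh <= RTE_TXGBE_TX_MAX_FREE_BUF_SZ' part of the check is
assumed to hold and left out. It only models the short-circuit in
'(rte_eal_process_type() != RTE_PROC_PRIMARY || txgbe_txq_vec_setup(txq) == 0)'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for txgbe_xmit_pkts_simple / txgbe_xmit_pkts_vec. */
enum tx_path { TX_PATH_SIMPLE, TX_PATH_VEC };

/* Hypothetical model of one process selecting its Tx function. */
struct proc_model {
	bool is_primary;     /* models rte_eal_process_type() == RTE_PROC_PRIMARY */
	int vec_setup_ret;   /* value the modeled vector setup would return */
	int vec_setup_calls; /* how many times the modeled setup actually ran */
};

/* Models txgbe_txq_vec_setup(): 0 on success, non-zero on failure. */
static int model_vec_setup(struct proc_model *p)
{
	p->vec_setup_calls++;
	return p->vec_setup_ret;
}

/*
 * Same shape as the quoted condition. Because || short-circuits, a
 * secondary process never runs the setup and always takes the vector path.
 */
enum tx_path select_tx_path(struct proc_model *p)
{
	assert(p != NULL);
	if (!p->is_primary || model_vec_setup(p) == 0)
		return TX_PATH_VEC;
	return TX_PATH_SIMPLE;
}

/* True when primary and secondary would end up on different Tx paths. */
bool paths_mismatch(int primary_setup_ret)
{
	struct proc_model primary = { true, primary_setup_ret, 0 };
	struct proc_model secondary = { false, primary_setup_ret, 0 };

	return select_tx_path(&primary) != select_tx_path(&secondary);
}
```

In this model, 'paths_mismatch(0)' is false and 'paths_mismatch(-1)' is
true: the mismatch appears only when the setup fails in the primary.
Whether that failure can actually occur in the driver is the open
question.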

There are other comments that were not addressed. I assume they are
accepted and will be handled in a new version, but I want to highlight
them in case they were missed.