[dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS

Ouyang, Changchun changchun.ouyang at intel.com
Fri Dec 26 03:07:55 CET 2014



> -----Original Message-----
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Thursday, December 25, 2014 9:14 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
> 
> 
> On 12/25/14 04:14, Ouyang, Changchun wrote:
> > Hi,
> >
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Wednesday, December 24, 2014 6:40 PM
> >> To: Ouyang, Changchun; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
> >>
> >>
> >> On 12/24/14 07:23, Ouyang Changchun wrote:
> >>> It needs config RSS and IXGBE_MRQC and IXGBE_VFPSRTYPE to enable
> VF
> >> RSS.
> >>> The psrtype will determine how many queues the received packets will
> >>> distribute to, and the value of psrtype should depends on both facet:
> >>> max VF rxq number which has been negotiated with PF, and the number
> >>> of
> >> rxq specified in config on guest.
> >>> Signed-off-by: Changchun Ouyang <changchun.ouyang at intel.com>
> >>> ---
> >>>    lib/librte_pmd_ixgbe/ixgbe_pf.c   | 15 +++++++
> >>>    lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 92
> >> ++++++++++++++++++++++++++++++++++-----
> >>>    2 files changed, 97 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> b/lib/librte_pmd_ixgbe/ixgbe_pf.c index cbb0145..9c9dad8 100644
> >>> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev
> >> *eth_dev)
> >>>    	IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw-
> mac.num_rar_entries), 0);
> >>>    	IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw-
> mac.num_rar_entries), 0);
> >>>
> >>> +	/*
> >>> +	 * VF RSS can support at most 4 queues for each VF, even if
> >>> +	 * 8 queues are available for each VF, it need refine to 4
> >>> +	 * queues here due to this limitation, otherwise no queue
> >>> +	 * will receive any packet even RSS is enabled.
> >> According to Table 7-3 in the 82599 spec RSS is not available when
> >> port is configured to have 8 queues per pool. This means that if u
> >> see this configuration u may immediately disable RSS flow in your code.
> >>
> > 8 queues here means the available number queue per vf, it is
> > calculated according to max vfs, e.g. if max vfs is 16(or less than),
> > then each vf 'COULD' have 8 queues evenly, pf early init stage estimate this
> value, but that is not precise, so need refine this.
> > User don't know this estimated value, it is internal value, not come from
> user's input/configure.
> > Hope it is clear to you.
> >>> +	 */
> >>> +	if (eth_dev->data->dev_conf.rxmode.mq_mode ==
> >> ETH_MQ_RX_VMDQ_RSS) {
> >>> +		if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool == 8) {
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).active =
> >> ETH_32_POOLS;
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 4;
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx =
> >>> +				dev_num_vf(eth_dev) * 4;
> >> According to 82599 spec u can't do that since RSS is not allowed when
> >> port is configured to have 8 function per-VF. Have u verified that
> >> this works? If yes, then spec should be updated.
> >>
> > Response as above,
> > Of course I have validated this. It works well.
> >
> >>> +		}
> >>> +	}
> >>> +
> >>>    	/* set VMDq map to default PF pool */
> >>>    	hw->mac.ops.set_vmdq(hw, 0,
> >>> RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx);
> >>>
> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> index f69abda..a7c17a4 100644
> >>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> @@ -3327,6 +3327,39 @@ ixgbe_alloc_rx_queue_mbufs(struct
> >> igb_rx_queue *rxq)
> >>>    }
> >>>
> >>>    static int
> >>> +ixgbe_config_vf_rss(struct rte_eth_dev *dev) {
> >>> +	struct ixgbe_hw *hw;
> >>> +	uint32_t mrqc;
> >>> +
> >>> +	ixgbe_rss_configure(dev);
> >>> +
> >>> +	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>> +
> >>> +	/* MRQC: enable VF RSS */
> >>> +	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
> >>> +	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
> >>> +	switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >>> +	case ETH_64_POOLS:
> >>> +		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
> >>> +		break;
> >>> +
> >>> +	case ETH_32_POOLS:
> >>> +	case ETH_16_POOLS:
> >>> +		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
> >> Again, this contradicts with the spec.
> >>
> >>> +		break;
> >>> +
> >>> +	default:
> >>> +		PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int
> >>>    ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
> >>>    {
> >>>    	struct ixgbe_hw *hw =
> >>> @@ -3358,24 +3391,38 @@ ixgbe_dev_mq_rx_configure(struct
> >> rte_eth_dev *dev)
> >>>    			default: ixgbe_rss_disable(dev);
> >>>    		}
> >>>    	} else {
> >>> -		switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >>>    		/*
> >>>    		 * SRIOV active scheme
> >>>    		 * FIXME if support DCB/RSS together with VMDq & SRIOV
> >>>    		 */
> >>> -		case ETH_64_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >> IXGBE_MRQC_VMDQEN);
> >>> +		switch (dev->data->dev_conf.rxmode.mq_mode) {
> >>> +		case ETH_MQ_RX_RSS:
> >>> +		case ETH_MQ_RX_VMDQ_RSS:
> >>> +			ixgbe_config_vf_rss(dev);
> >>>    			break;
> >>>
> >>> -		case ETH_32_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >> IXGBE_MRQC_VMDQRT4TCEN);
> >>> -			break;
> >>> +		default:
> >>> +			switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >> Sorry for nitpicking but have u considered taking this encapsulated
> >> "switch- case" block into a separate function? This could make the
> >> code look a lot nicer. ;)
> > Only one place use it, so don't need make it a function, And I prefer
> > to the current code.
> 
> Functions may be used not only to have a repeatedly called code but also to
> make a caller code more readable. Encapsulated switch-case is one of the
> examples of a *not* readable code constructs which should be avoided.
> 
> >
> >>> +			case ETH_64_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQEN);
> >>> +				break;
> >>>
> >>> -		case ETH_16_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >> IXGBE_MRQC_VMDQRT8TCEN);
> >>> +			case ETH_32_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQRT4TCEN);
> >>> +				break;
> >>> +
> >>> +			case ETH_16_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQRT8TCEN);
> >>> +				break;
> >>> +			default:
> >>> +				PMD_INIT_LOG(ERR,
> >>> +					"invalid pool number in IOV mode");
> >>> +				break;
> >>> +			}
> >>>    			break;
> >>> -		default:
> >>> -			PMD_INIT_LOG(ERR, "invalid pool number in IOV
> >> mode");
> >>>    		}
> >>>    	}
> >>>
> >>> @@ -3989,10 +4036,32 @@ ixgbevf_dev_rx_init(struct rte_eth_dev
> *dev)
> >>>    	uint16_t buf_size;
> >>>    	uint16_t i;
> >>>    	int ret;
> >>> +	uint16_t valid_rxq_num;
> >>>
> >>>    	PMD_INIT_FUNC_TRACE();
> >>>    	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>>
> >>> +	valid_rxq_num = RTE_MIN(dev->data->nb_rx_queues,
> >>> +hw->mac.max_rx_queues);
> >>> +
> >>> +	/*
> >>> +	 * VMDq RSS can't support 3 queues, so config it into 4 queues,
> >>> +	 * and give user a hint that some packets may loss if it doesn't
> >>> +	 * poll the queue where those packets are distributed to.
> >>> +	 */
> >>> +	if (valid_rxq_num == 3)
> >>> +		valid_rxq_num = 4;
> >> Why to configure more queues that requested and not less (2)? Why to
> >> configure anything at all and not return an error?
> >>
> >>> +
> >>> +	if (dev->data->nb_rx_queues > valid_rxq_num) {
> >>> +		PMD_INIT_LOG(ERR, "The number of Rx queue invalid, "
> >>> +			"it should be equal to or less than %d",
> >>> +			valid_rxq_num);
> >>> +		return -1;
> >>> +	} else if (dev->data->nb_rx_queues < valid_rxq_num)
> >>> +		PMD_INIT_LOG(ERR, "The number of Rx queue is less "
> >>> +			"than the number of available Rx queues:%d, "
> >>> +			"packets in Rx queues(q_id >= %d) may loss.",
> >>> +			valid_rxq_num, dev->data->nb_rx_queues);
> >> Who ever looks in the "INIT_LOG" if everything "work well" and u make it
> >> look so by allowing this call to succeed. And then some packets will just
> >> silently not arrive?! And what the used should somehow guess to do?
> >> - Look in the "INIT_LOG"?! This is a nightmare!
> > Sorry, I don't think so again, if user find any packets loss, he will care for log,
> > Then he can find that log there, then user can refine its rxq number due
> the wrong rxq number,
> > Why is it a nightmare?
> 
> Because usually u expect that if the function call returns with a
> success it means a success. Why a user has to learn that a device
> configuration function was provided with wrong parameters from the
> packet loss? If parameters are not allowed u expect to get an error as a
> return value. Since when errors are returned in a form of a log message?
> Why do u think there is a living person running a DPDK based
> application? How do u expect somebody build an automated environment
> when part of errors are returned in some log? Should he/she add a log
> parser?
If it is automated environment, the log parsing should also be automated,
At least it should report to living person.  But this is another topic we could stop here. Let's focus on vf rss

> On the other hand, why do u think 4 queues is a better option for a user
> than 2 queue when he asked for 3 queues? What kind of heuristics is that?
> 
> To summarize - it would be much better if u just returned an EINVAL
> error in that case.

As I said, it seems you have strong reason to use 2 queues here, and return error
If user try to use 3 queues, I will consider it in v4, but still need insights from more guys here.
> 
> >
> > I don't agree with you about "silently not arrive", because we have hint/log
> there.
> >
> > Return error here is also possible way,
> 
> It's the only possible way! ;)
> 
> > Again need other guys' insight here.
> >
> >>> +
> >>>    	/*
> >>>    	 * When the VF driver issues a IXGBE_VF_RESET request, the PF
> >> driver
> >>>    	 * disables the VF receipt of packets if the PF MTU is > 1500.
> >>> @@ -4094,6 +4163,9 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
> >>>    			IXGBE_PSRTYPE_IPV6HDR;
> >>>    #endif
> >>>
> >>> +	/* Set RQPL for VF RSS according to max Rx queue */
> >>> +	psrtype |= (valid_rxq_num >> 1) <<
> >>> +		IXGBE_PSRTYPE_RQPL_SHIFT;
> >>>    	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
> >>>
> >>>    	if (dev->data->dev_conf.rxmode.enable_scatter) {



More information about the dev mailing list