[dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri

Zhang, Qi Z qi.z.zhang at intel.com
Tue Jan 12 03:14:07 CET 2021



> -----Original Message-----
> From: Thomas Monjalon <thomas at monjalon.net>
> Sent: Monday, January 11, 2021 10:54 PM
> To: Yigit, Ferruh <ferruh.yigit at intel.com>; Guo, Jia <jia.guo at intel.com>; Zhang,
> Qi Z <qi.z.zhang at intel.com>
> Cc: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>; Ori Kam
> <orika at nvidia.com>; Wu, Jingjing <jingjing.wu at intel.com>; Yang, Qiming
> <qiming.yang at intel.com>; Wang, Haiyue <haiyue.wang at intel.com>;
> dev at dpdk.org; Gregory Etelson <getelson at nvidia.com>;
> maxime.coquelin at redhat.com; jerinj at marvell.com;
> ajit.khaparde at broadcom.com; Bing Zhao <bingz at nvidia.com>
> Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
> 
> 11/01/2021 15:02, Zhang, Qi Z:
> > From: Thomas Monjalon <thomas at monjalon.net>
> > > 11/01/2021 12:26, Zhang, Qi Z:
> > > > From: Thomas Monjalon <thomas at monjalon.net>
> > > > > 10/01/2021 11:46, Ori Kam:
> > > > > > From: Zhang, Qi Z <qi.z.zhang at intel.com>
> > > > > > > From: Thomas Monjalon <thomas at monjalon.net>
> > > > > > > > 08/01/2021 10:29, Andrew Rybchenko:
> > > > > > > > > On 1/8/21 11:57 AM, Ferruh Yigit wrote:
> > > > > > > > > > On 1/8/2021 1:41 AM, Zhang, Qi Z wrote:
> > > > > > > > > >> From: Thomas Monjalon <thomas at monjalon.net>
> > > > > > > > > >>> Yes the port number is free.
> > > > > > > > > >>> But isn't it more natural to specify this port
> > > > > > > > > >>> number as part of the rte_flow rule?
> > > > > > > > > >>
> > > > > > > > > >> I think if we have a rte_flow action type that can be
> > > > > > > > > >> used to set a packet's tunnel type xxx, like below
> > > > > > > > > >> #flow create eth/ipv4/udp port is 4789/... action
> > > > > > > > > >> set_tunnel_type VxLAN / end then we may replace it
> > > > > > > > > >> with rte_flow, but I'm not sure if it's necessary,
> > > > > > > > > >> please share if you have a better idea.
> > > > > > > >
> > > > > > > > Of course we can specify the UDP port in rte_flow rule.
> > > > > > > > Please check rte_flow_item_udp.
> > > > > > > > That's a basic of rte_flow.
> > > > > > >
> > > > > > > Its not about the pattern match, it's about the action, what
> > > > > > > we need is a rte_flow action to "define a packet's tunnel
> > > > > > > type", but we don't
> > > have.
> > > > >
> > > > > A packet type alone is meaningless.
> > > > > It is always associated to an action, this is what rte_flow does.
> > > >
> > > > As I mentioned in previous, this is a device (port) level
> > > > configuration, so it can
> > > only be configured by a PF driver or a privileged VF base on our security
> model.
> > > > A typical usage in a NFV environment could be:
> > > >
> > > > 1. A privileged VF (e.g. ice_dcf PMD) use
> > > > rte_eth_dev_udp_tunnel_port_add
> > > to create tunnel port for eCPRI, them this will impact on all VFs in the same
> PF.
> > > > 2. A normal VF driver can create rte_flow rule that match specific
> > > > patch for
> > > queue steering or apply RSS for eCPRI packets, but it DON'T have the
> > > permission to define the tunnel port.
> > >
> > > Whaooh! A normal Intel VF is not allowed to match the tunnel it
> > > wants if not enabled by a priviledged VF?
> >
> > > I would say it is a HW design flaw, but that's not the question.
> >
> > Why you think this is a design flaw? in real case, is it a typical
> > requirement that different VF need different tunnel port for eCPRI (or
> > VxLan) on the same PF?
> 
> They are different VFs, so why should they use the same UDP port?
> Anyway it doesn't need to be typical to be allowed.

Yes, of cause, your can support different UDP tunnel port for different VF, but there are lots of alternative ways to isolate VFs, its just not a big deal for most real use case.
The typical requirement is some customer want eCPRI with UDP port A, while another one want UDP port B, and our NIC is good enough to support both cases separately.
There are seldom cases that different eCPRI tunnel port need to be deployed on the same NIC or same port.
so from my view, it's a reasonable design compromise that lose minor software flexibility but get a more simplified firmware and save more hardware resource from unnecessary usage.

> 
> > I believe it's not necessary to make it as a per VF resource in most
> > cases, and I will be surprise if a driver that allow any VF to change
> > the share resource without any privilege control.
> 
> The thing is that a flow rule should not be a shared resource.
> In Intel devices, it seems the UDP port of a protocol is supposed to be shared
> with all VFs, but it looks a very specific assumption, or limitation.
> I wonder how we can document this and ask the user to call
> rte_eth_dev_udp_tunnel_port_add(), because of some devices.
> Anyway, this is currently poorly documented.

OK, let me check the document to see if anything we can improve.

> 
> > Btw I guess mlx NIC has more flexible way to handle ecpri tunnel, just
> > curious how it works, what's the expected result of below rules?
> >
> > 1. create flow eth / ipv4 / udp dst is 1234 / ecpri msgtype is 0 / ...
> > to queue 0 2. create flow eth / ipv4 / udp dst is 5678 / ecrpi msgtype is 1 / ...
> to queue 1.
> 
> It should move the eCPRI packets to the right queue, taking into consideration
> the UDP port and the message type.
> Of course there may be some bugs :)

I guess it is not just some bugs, I saw below note in Mellanox latest MLX5 driver.
"eCPRI over UDP layer is not yet supported right now",  
but this is not the question, I believe your answers are all fit for the VxLan case :)

For VxLAN offload I note below statement from your user manual

*If you configure multiple UDP ports for offload and exceed the total number of ports supported by hardware, then those additional ports will
still function properly, but will not benefit from any of the stateless offloads. 

Looks like you have a port limitation, additional port that above this number will not work with offload like RSS/steering ...,that's fine.
So my understanding the port resource is not just a regular rule in your general flow table.
The questions is how many is the limitation ?  does each VF has its own resource pool? 
If they are shared, how do you manage these ports? 
What if one malicious VF used up all the tunnel ports, does another VF still get chance to create its own?

> 
> > So both 1234 and 5678 will be regarded as an ECPRI packet?
> 
> Yes, both should be considered eCPRI.
> 
> > Or only the first one will work?
> 
> I am not aware of such limitation.
> 
> > does dst udp port is always needed if an ecpri pattern is involved?
> 
> No, the UDP part is optional.
> 



More information about the dev mailing list