[dpdk-dev] [RFC PATCH 0/6] General tunneling APIs

Walukiewicz, Miroslaw Miroslaw.Walukiewicz at intel.com
Wed Dec 23 12:17:33 CET 2015


Hi Jijang,

I like the idea of a tunnel API very much.

I have a few questions. 

1. I see that you have only i40e support, due to the lack of HW tunneling support in other NICs.
I don't see how you want to handle tunneling requests for NICs without HW offload.

I think we should have one common function for sending tunneled packets, but the initialization should check the NIC capabilities and, where HW support is missing, call a registered function that does the tunneling in SW.

I know that building a tunnel in SW is a very time-consuming process, but it makes the API more generic. Similarly, only 3 protocols are supported by i40e in HW, while we can imagine about 40 or more different tunnel types working with this NIC.

With a SW implementation we could support the missing tunnels even on i40e.
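
To illustrate point 1, here is a minimal sketch of selecting the encap path once at configure time. All names (tnl_type, tnl_ops, tnl_select_ops, the capability bitmask, and GRE as an example of a type without offload) are hypothetical and are not part of DPDK or of the patch set; the two encap functions are empty stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tunnel types; bit N of the capability mask means
 * "type N is offloaded by this port". Not DPDK definitions. */
enum tnl_type { TNL_VXLAN = 0, TNL_GENEVE, TNL_TEREDO, TNL_GRE, TNL_TYPE_MAX };

typedef int (*tnl_encap_fn)(void *pkt);

struct tnl_ops {
	tnl_encap_fn encap; /* path chosen at configure time */
	int is_hw;          /* 1: NIC offload, 0: software fallback */
};

/* Empty stand-ins for a driver offload path and a generic SW path. */
static int hw_encap(void *pkt) { (void)pkt; return 0; }
static int sw_encap(void *pkt) { (void)pkt; return 0; }

/* Use HW offload if the port advertises this tunnel type,
 * otherwise fall back to the registered software implementation. */
static struct tnl_ops
tnl_select_ops(uint32_t hw_caps, enum tnl_type type)
{
	struct tnl_ops ops;

	assert(type < TNL_TYPE_MAX);
	if (hw_caps & (UINT32_C(1) << type)) {
		ops.encap = hw_encap;
		ops.is_hw = 1;
	} else {
		ops.encap = sw_encap;
		ops.is_hw = 0;
	}
	return ops;
}
```

The caller would keep the returned ops per tunnel and call ops.encap() on the TX path, so the application sees the same send function whether or not the NIC offloads that tunnel type.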

2. I understand that we need the RX HW queue defined in struct rte_eth_tunnel_conf, but why is tx_queue necessary?
  As far as I know the i40e HW, we can place tunneled packet descriptors in any HW TX queue and receive on only one specific queue.

3. I see a similar problem with receiving tunneled packets on a single queue only. I know that some NICs, like fm10k, can hash packets and spread the same tunnel across many queues. Maybe the design should support such an RSS-like feature as well. I know it is not supported by i40e, but a more flexible API design is good to have.
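
A minimal sketch of the RSS-like idea in point 3: packets of one tunnel are spread over a range of RX queues by hashing the inner flow. The struct, the function names and the queue-range parameters are hypothetical; FNV-1a is used here only as a simple stand-in for whatever hash a NIC actually implements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: inner flow fields plus the tunnel ID. */
struct inner_flow_key {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint32_t tunnel_id;
};

/* Mix one 32-bit value into a 32-bit FNV-1a hash, byte by byte. */
static uint32_t
fnv1a_u32(uint32_t h, uint32_t v)
{
	for (int i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u; /* FNV prime */
	}
	return h;
}

/* Map a flow to one queue in [first_q, first_q + nb_q). */
static uint16_t
tnl_pick_rx_queue(const struct inner_flow_key *k,
		  uint16_t first_q, uint16_t nb_q)
{
	uint32_t h = 2166136261u; /* FNV offset basis */

	assert(k != NULL && nb_q > 0);
	h = fnv1a_u32(h, k->src_ip);
	h = fnv1a_u32(h, k->dst_ip);
	h = fnv1a_u32(h, ((uint32_t)k->src_port << 16) | k->dst_port);
	h = fnv1a_u32(h, k->tunnel_id);
	return (uint16_t)(first_q + h % nb_q);
}
```

With nb_q set to 1 this degenerates to the single-queue behaviour, so a configuration structure carrying a queue range instead of a single rx_queue would cover both cases.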

4. In your implementation you are assuming that there is one tunnel configured per DPDK interface:

rte_eth_dev_tunnel_configure(uint8_t port_id,
			     struct rte_eth_tunnel_conf *tunnel_conf)

The point of tunneling is that the system runs out of interfaces, because the number of possible VLANs is too small (4094 usable IDs).
In DPDK we would have only one tunnel per physical port, which is useless even with the big acceleration provided by i40e.

In normal use cases there is a need for tens of thousands of tunnels per interface. Even VXLAN alone has 24 bits for the tunnel ID.
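
For reference, the 24-bit tunnel ID (VNI) sits in the 8-byte VXLAN header defined by RFC 7348, which allows 2^24 = 16,777,216 distinct tunnels. A minimal sketch of writing and reading it; the function names are hypothetical, only the header layout comes from the RFC.

```c
#include <assert.h>
#include <stdint.h>

#define VXLAN_HDR_LEN  8
#define VXLAN_FLAG_VNI 0x08 /* "I" flag: the VNI field is valid */

/* Fill an 8-byte VXLAN header: flags, 24 reserved bits,
 * 24-bit VNI in network byte order, 8 reserved bits. */
static void
vxlan_hdr_write(uint8_t hdr[VXLAN_HDR_LEN], uint32_t vni)
{
	assert(vni <= 0xFFFFFFu); /* VNI is 24 bits wide */
	hdr[0] = VXLAN_FLAG_VNI;
	hdr[1] = hdr[2] = hdr[3] = 0;
	hdr[4] = (uint8_t)(vni >> 16);
	hdr[5] = (uint8_t)(vni >> 8);
	hdr[6] = (uint8_t)vni;
	hdr[7] = 0;
}

/* Extract the 24-bit VNI from a VXLAN header. */
static uint32_t
vxlan_hdr_vni(const uint8_t hdr[VXLAN_HDR_LEN])
{
	return ((uint32_t)hdr[4] << 16) | ((uint32_t)hdr[5] << 8) | hdr[6];
}
```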

I think we need a dedicated send API, something like rte_eth_dev_tunnel_send_burst, which takes a tunnel number allocated by rte_eth_dev_tunnel_configure, so that the tunnel-specific information does not have to be set separately in each descriptor.
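
A minimal sketch of the handle-based idea: configuration stores the tunnel parameters once and returns a small integer, and the send call only needs that integer. The table layout, its size and the names tnl_alloc/tnl_send_burst are hypothetical and only stand in for the proposed rte_eth_dev_tunnel_configure/rte_eth_dev_tunnel_send_burst; the send path here just counts packets instead of encapsulating them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TNL_TABLE_SIZE 1024 /* sketch only; a real table would be larger */

struct tnl_entry {
	int      in_use;
	uint32_t tunnel_id; /* e.g. VXLAN VNI */
	uint32_t remote_ip; /* outer destination, host byte order */
	uint64_t tx_pkts;   /* per-tunnel counter */
};

struct tnl_table {
	struct tnl_entry e[TNL_TABLE_SIZE];
};

/* Store the tunnel parameters once; return a handle, or -1 if full. */
static int
tnl_alloc(struct tnl_table *t, uint32_t tunnel_id, uint32_t remote_ip)
{
	assert(t != NULL);
	for (int i = 0; i < TNL_TABLE_SIZE; i++) {
		if (!t->e[i].in_use) {
			t->e[i] = (struct tnl_entry){ 1, tunnel_id, remote_ip, 0 };
			return i;
		}
	}
	return -1;
}

/* Send a burst on one tunnel: a single table lookup per burst.
 * Returns the number of packets accepted (0 for a bad handle). */
static uint16_t
tnl_send_burst(struct tnl_table *t, int handle, void **pkts, uint16_t nb_pkts)
{
	assert(t != NULL);
	if (handle < 0 || handle >= TNL_TABLE_SIZE || !t->e[handle].in_use)
		return 0;
	(void)pkts; /* a real implementation would encapsulate and enqueue */
	t->e[handle].tx_pkts += nb_pkts;
	return nb_pkts;
}
```

The linear scan in tnl_alloc is fine for a control-path sketch; the data path touches only the entry selected by the handle.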

Similarly, on RX, struct rte_eth_tunnel_conf should carry callback functions that perform a tunnel-specific action on each received packet: pushing the packet to a user ring, setting the tunnel information in the RX descriptor, or something else.
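
A minimal sketch of such a per-tunnel RX action, assuming a callback pointer and an opaque argument were added to the tunnel configuration. The types tnl_rx_conf and user_ring and all function names are hypothetical; the "ring" is a plain fixed-size array, not an rte_ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-tunnel RX action. */
typedef void (*tnl_rx_cb)(void *pkt, uint32_t tunnel_id, void *arg);

struct tnl_rx_conf {
	uint32_t  tunnel_id;
	tnl_rx_cb cb;     /* action applied to every received packet */
	void     *cb_arg; /* opaque user context passed to cb */
};

/* Toy user ring: fixed array with a fill count. */
struct user_ring {
	void    *slot[8];
	unsigned count;
};

/* Example action: push the packet to the user's ring, drop when full. */
static void
rx_to_ring(void *pkt, uint32_t tunnel_id, void *arg)
{
	struct user_ring *r = (struct user_ring *)arg;

	(void)tunnel_id;
	if (r->count < 8)
		r->slot[r->count++] = pkt;
}

/* Apply the tunnel's registered action to a received burst. */
static void
tnl_rx_dispatch(const struct tnl_rx_conf *c, void **pkts, uint16_t nb_pkts)
{
	assert(c != NULL && c->cb != NULL);
	for (uint16_t i = 0; i < nb_pkts; i++)
		c->cb(pkts[i], c->tunnel_id, c->cb_arg);
}
```

A different tunnel could register another function with the same signature, for example one that only records the tunnel ID in per-packet metadata.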

5. I see that you have implementations for VXLAN, TEREDO, and GENEVE tunnels in the i40e driver. I could only find the implementation for VXLAN encap/decap. Are all files of the patch present?

6. What about QinQ HW tunneling, which is also supported by the i40e HW? I know that its implementation lives in a different place, but why not include QinQ as an additional tunnel type? It would be very nice to have all tunnel APIs in a single place.

Regards,

Mirek




> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Wednesday, December 23, 2015 9:50 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
> 
> I want to define a set of general tunneling APIs, which are used to
> accelerate tunneling packet processing in DPDK.
> In this RFC patch set, I will explain my idea using some code.
> 
> 1. Using flow director offload to define a tunnel flow in a pair of queues.
> 
> flow rule: src IP + dst IP + src port + dst port + tunnel ID (for VXLAN)
> 
> For example:
> 	struct rte_eth_tunnel_conf {
> 		.tunnel_type = VXLAN,
> 		.rx_queue = 1,
> 		.tx_queue = 1,
> 		.filter_type = 'src ip + dst ip + src port + dst port + tunnel id',
> 		.flow_tnl {
> 			.tunnel_type = VXLAN,
> 			.tunnel_id = 100,
> 			.remote_mac = 11:22:33:44:55:66,
> 			.ip_type = ipv4,
> 			.outer_ipv4.src_ip = 192.168.10.1,
> 			.outer_ipv4.dst_ip = 10.239.129.11,
> 			.src_port = 1000,
> 			.dst_port = 2000
> 		}
> 	};
> 
> 2. Configure tunnel flow for a device and for a pair of queues.
> 
> rte_eth_dev_tunnel_configure(0, &rte_eth_tunnel_conf);
> 
> This API registers the RX decapsulation and TX encapsulation callback
> functions if the HW doesn't support encap/decap. It also allocates
> space for the tunnel configuration and stores a pointer to this
> newly allocated space in dev->post_rx/tx_burst_cbs[].param.
> 
> rte_eth_add_rx_callback(port_id, tunnel_conf.rx_queue,
>                         rte_eth_tunnel_decap, (void *)tunnel_conf);
> rte_eth_add_tx_callback(port_id, tunnel_conf.tx_queue,
>                         rte_eth_tunnel_encap, (void *)tunnel_conf);
> 
> 3. Using rte_vxlan_decap_burst() to do decapsulation of tunneling packets.
> 
> 4. Using rte_vxlan_encap_burst() to do encapsulation of tunneling packets.
>    The 'src ip, dst ip, src port, dst port and tunnel ID' can be taken from the tunnel
> configuration.
>    SIMD is used to accelerate the operation.
> 
> An example of how to use these APIs:
> 
> 1) At config phase
> 
> dev_config(port, ...);
> tunnel_config(port,...);
> ...
> dev_start(port);
> ...
> rx_burst(port, rxq,... );
> tx_burst(port, txq,...);
> 
> 
> 2) At transmitting packet phase
> Only the outer src/dst MAC addresses need to be set for the TX tunnel
> configuration in dev->post_tx_burst_cbs[].param.
> 
> In this patch set, I have not finished all of the code; the purpose of sending
> the patch set is to collect more comments and suggestions on
> this idea.
> 
> 
> Jijiang Liu (6):
>   extend rte_eth_tunnel_flow
>   define tunnel flow structure and APIs
>   implement tunnel flow APIs
>   define rte_vxlan_decap/encap
>   implement rte_vxlan_decap/encap
>   i40e tunnel configure
> 
>  drivers/net/i40e/i40e_ethdev.c             |   41 +++++
>  lib/librte_ether/libtunnel/rte_vxlan_opt.c |  251
> ++++++++++++++++++++++++++++
>  lib/librte_ether/libtunnel/rte_vxlan_opt.h |   49 ++++++
>  lib/librte_ether/rte_eth_ctrl.h            |   14 ++-
>  lib/librte_ether/rte_ethdev.h              |   28 +++
>  lib/librte_ether/rte_ethdev.c              |   60 ++
>  5 files changed, 440 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_ether/libtunnel/rte_vxlan_opt.c
>  create mode 100644 lib/librte_ether/libtunnel/rte_vxlan_opt.h
> 
> --
> 1.7.7.6


