[dpdk-dev] [PATCH] examples/l3fwd: fix using packet type blindly
Ananyev, Konstantin
konstantin.ananyev at intel.com
Tue Mar 1 14:51:31 CET 2016
Hi Jianfeng,
> -----Original Message-----
> From: Tan, Jianfeng
> Sent: Tuesday, March 01, 2016 1:24 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Ananyev, Konstantin; nelio.laranjeiro at 6wind.com; adrien.mazarguil at 6wind.com; rahul.lakkireddy at chelsio.com;
> pmatilai at redhat.com; Tan, Jianfeng
> Subject: [PATCH] examples/l3fwd: fix using packet type blindly
>
> As a example to use ptype info, l3fwd needs firstly to use
> rte_eth_dev_get_ptype_info() API to check if device and/or
> its PMD driver will parse and fill the needed packet type;
> if not, use the newly added option, --parse-ptype, to
> analyze it in the callback softly.
>
> As the mode of EXACT_MATCH uses the 5 tuples to caculate
> hash, so we narrow down its scope to:
> a. ip packets with no extensions, and
> b. L4 payload should be either tcp or udp.
>
> Note: this patch does not completely solve the issue, "cannot
> run l3fwd on virtio or other devices", because hw_ip_checksum
> may be not supported by the devices. Currently we can: option
> 1, remove this requirements; option 2, wait for virtio front
> end (pmd) to support it.
>
> Signed-off-by: Jianfeng Tan <jianfeng.tan at intel.com>
> ---
> doc/guides/rel_notes/release_16_04.rst | 5 ++
> doc/guides/sample_app_ug/l3_forward.rst | 6 ++-
> examples/l3fwd/l3fwd.h | 12 +++++
> examples/l3fwd/l3fwd_em.c | 94 +++++++++++++++++++++++++++++++++
> examples/l3fwd/l3fwd_lpm.c | 57 ++++++++++++++++++++
> examples/l3fwd/main.c | 50 ++++++++++++++++++
> 6 files changed, 223 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
> index 64e913d..4d6260e 100644
> --- a/doc/guides/rel_notes/release_16_04.rst
> +++ b/doc/guides/rel_notes/release_16_04.rst
> @@ -68,6 +68,11 @@ This section should contain bug fixes added to the relevant sections. Sample for
> vhost-switch often fails to allocate mbuf when dequeue from vring because it
> wrongly calculates the number of mbufs needed.
>
> +* **examples/l3fwd: Fixed using packet type blindly.**
> +
> + l3fwd makes use of packet type information without even query if devices or PMDs
> + really set it. For those don't set ptypes, add an option to parse it softly.
> +
>
> EAL
> ~~~
> diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst
> index 4ce734b..e0c22e3 100644
> --- a/doc/guides/sample_app_ug/l3_forward.rst
> +++ b/doc/guides/sample_app_ug/l3_forward.rst
> @@ -93,7 +93,7 @@ The application has a number of command line options:
>
> .. code-block:: console
>
> - ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len
> PKTLEN]] [--no-numa][--hash-entry-num][--ipv6]
> + ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len
> PKTLEN]] [--no-numa][--hash-entry-num][--ipv6] [--parse-ptype]
>
> where,
>
> @@ -114,6 +114,8 @@ where,
>
> * --ipv6: optional, set it if running ipv6 packets
>
> +* --parse-ptype: optional, set it if use software way to analyze packet type
> +
> For example, consider a dual processor socket platform where cores 0-7 and 16-23 appear on socket 0, while cores 8-15 and 24-31
> appear on socket 1.
> Let's say that the programmer wants to use memory from both NUMA nodes, the platform has only two ports, one connected to
> each NUMA node,
> and the programmer wants to use two cores from each processor socket to do the packet processing.
> @@ -334,6 +336,8 @@ The key code snippet of simple_ipv4_fwd_4pkts() is shown below:
>
> The simple_ipv6_fwd_4pkts() function is similar to the simple_ipv4_fwd_4pkts() function.
>
> +Known issue: IP packets with extensions or IP packets which are not TCP/UDP cannot work well with this mode.
> +
> Packet Forwarding for LPM-based Lookups
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
> index da6d369..966c83b 100644
> --- a/examples/l3fwd/l3fwd.h
> +++ b/examples/l3fwd/l3fwd.h
> @@ -198,6 +198,18 @@ void
> setup_hash(const int socketid);
>
> int
> +em_check_ptype(int portid);
> +
> +int
> +lpm_check_ptype(int portid);
> +
> +void
> +em_parse_ptype(struct rte_mbuf *);
> +
> +void
> +lpm_parse_ptype(struct rte_mbuf *);
> +
> +int
> em_main_loop(__attribute__((unused)) void *dummy);
>
> int
> diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
> index f6a65d8..1061baf 100644
> --- a/examples/l3fwd/l3fwd_em.c
> +++ b/examples/l3fwd/l3fwd_em.c
> @@ -42,6 +42,7 @@
> #include <errno.h>
> #include <getopt.h>
> #include <stdbool.h>
> +#include <netinet/in.h>
>
> #include <rte_debug.h>
> #include <rte_ether.h>
> @@ -508,6 +509,99 @@ populate_ipv6_many_flow_into_table(const struct rte_hash *h,
> printf("Hash: Adding 0x%x keys\n", nr_flow);
> }
>
> +/* Requirements:
> + * 1. IP packets without extension;
> + * 2. L4 payload should be either TCP or UDP.
> + */
> +int
> +em_check_ptype(int portid)
> +{
> + int i, ret;
> + int ptype_l3_ipv4_ext = 0;
> + int ptype_l3_ipv6_ext = 0;
> + int ptype_l4_tcp = 0;
> + int ptype_l4_udp = 0;
> +
> + ret = rte_eth_dev_get_ptype_info(portid,
> + RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_MASK,
> + NULL, 0);
> + if (ret <= 0)
> + return 0;
> +
> + uint32_t ptypes[ret];
> +
> + ret = rte_eth_dev_get_ptype_info(portid,
> + RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_MASK,
> + ptypes, ret);
> + for (i = 0; i < ret; ++i) {
> + switch (ptypes[i]) {
> + case RTE_PTYPE_L3_IPV4_EXT:
> + ptype_l3_ipv4_ext = 1;
> + break;
> + case RTE_PTYPE_L3_IPV6_EXT:
> + ptype_l3_ipv6_ext = 1;
> + break;
> + case RTE_PTYPE_L4_TCP:
> + ptype_l4_tcp = 1;
> + break;
> + case RTE_PTYPE_L4_UDP:
> + ptype_l4_udp = 1;
> + break;
> + }
> + }
> +
> + if (ptype_l3_ipv4_ext == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV4_EXT\n", portid);
> + if (ptype_l3_ipv6_ext == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV6_EXT\n", portid);
> + if (ptype_l3_ipv4_ext && ptype_l3_ipv6_ext)
> + return 1;
Why return here?
You'll miss L4 ptype checks below.
> +
> + if (ptype_l4_tcp == 0)
> + printf("port %d cannot parse RTE_PTYPE_L4_TCP\n", portid);
> + if (ptype_l4_udp == 0)
> + printf("port %d cannot parse RTE_PTYPE_L4_UDP\n", portid);
> + if (ptype_l4_tcp || ptype_l4_udp)
> + return 1;
> +
> + return 0;
> +}
> +
> +void
> +em_parse_ptype(struct rte_mbuf *m)
> +{
> + struct ether_hdr *eth_hdr;
> + uint32_t packet_type = 0;
> + uint16_t ethertype;
> + void *l4;
> + int hdr_len;
> + struct ipv4_hdr *ipv4_hdr;
> + struct ipv6_hdr *ipv6_hdr;
> +
> + eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> + ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> + l4 = (uint8_t *)eth_hdr + sizeof(struct ether_hdr);
Just curious why l4? It looks like l3 :)
> + switch (ethertype) {
> + case ETHER_TYPE_IPv4:
> + ipv4_hdr = (struct ipv4_hdr *)l4;
> + hdr_len = (ipv4_hdr->version_ihl & IPV4_HDR_IHL_MASK) *
> + IPV4_IHL_MULTIPLIER;
> + if (hdr_len == sizeof(struct ipv4_hdr) &&
> + (ipv4_hdr->next_proto_id == IPPROTO_TCP ||
> + ipv4_hdr->next_proto_id == IPPROTO_UDP))
> + packet_type |= RTE_PTYPE_L3_IPV4;
I think it needs to be something like:
If (hdr_len == sizeof(struct ipv4_hdr)) {
packet_type = RTE_PTYPE_L3_IPV4;
if (ipv4_hdr->next_proto_id == IPPROTO_TCP)
packet_type |= RTE_PTYPE_L4_TCP;
else if ipv4_hdr->next_proto_id == IPPROTO_UDP)
packet_type |= RTE_PTYPE_L4_TCP;
}
And then inside em forward check ptype to be sure that is IPV4 with no options and UDP/TCP packet.
Same for IPv6.
> + break;
> + case ETHER_TYPE_IPv6:
> + ipv6_hdr = (struct ipv6_hdr *)l4;
> + if (ipv6_hdr->proto == IPPROTO_TCP ||
> + ipv6_hdr->proto == IPPROTO_UDP)
> + packet_type |= RTE_PTYPE_L3_IPV6;
> + break;
> + }
> +
> + m->packet_type |= packet_type;
> +}
> +
> /* main processing loop */
> int
> em_main_loop(__attribute__((unused)) void *dummy)
> diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
> index e0ed3c4..981227a 100644
> --- a/examples/l3fwd/l3fwd_lpm.c
> +++ b/examples/l3fwd/l3fwd_lpm.c
> @@ -280,6 +280,63 @@ setup_lpm(const int socketid)
> }
> }
>
> +int
> +lpm_check_ptype(int portid)
> +{
> + int i, ret;
> + int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;
> +
> + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, NULL, 0);
> + if (ret <= 0)
> + return 0;
> +
> + uint32_t ptypes[ret];
> +
> + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK,
> + ptypes, ret);
> + for (i = 0; i < ret; ++i) {
> + if (ptypes[i] & RTE_PTYPE_L3_IPV4)
> + ptype_l3_ipv4 = 1;
> + if (ptypes[i] & RTE_PTYPE_L3_IPV6)
> + ptype_l3_ipv6 = 1;
> + }
> +
> + if (ptype_l3_ipv4 == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid);
> +
> + if (ptype_l3_ipv6 == 0)
> + printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid);
> +
> + if (ptype_l3_ipv4 && ptype_l3_ipv6)
> + return 1;
> +
> + return 0;
> +
> +}
> +
> +void
> +lpm_parse_ptype(struct rte_mbuf *m)
> +{
> + struct ether_hdr *eth_hdr;
> + uint32_t packet_type = 0;
> + uint16_t ethertype;
> +
> + eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> + ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> + switch (ethertype) {
> + case ETHER_TYPE_IPv4:
> + packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
> + break;
> + case ETHER_TYPE_IPv6:
> + packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
> + break;
> + default:
> + break;
> + }
> +
> + m->packet_type |= packet_type;
Might be safer:
m->packet_type = packet_type;
> +}
> +
> /* Return ipv4/ipv6 lpm fwd lookup struct. */
> void *
> lpm_get_ipv4_l3fwd_lookup_struct(const int socketid)
> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> index 0e33039..8889828 100644
> --- a/examples/l3fwd/main.c
> +++ b/examples/l3fwd/main.c
> @@ -103,6 +103,8 @@ static int l3fwd_lpm_on;
> static int l3fwd_em_on;
>
> static int numa_on = 1; /**< NUMA is enabled by default. */
> +static int parse_ptype; /**< Parse packet type using rx callback, and */
> + /**< disabled by default */
>
> /* Global variables. */
>
> @@ -172,6 +174,8 @@ static struct rte_mempool * pktmbuf_pool[NB_SOCKETS];
>
> struct l3fwd_lkp_mode {
> void (*setup)(int);
> + int (*check_ptype)(int);
> + void (*parse_ptype)(struct rte_mbuf *);
> int (*main_loop)(void *);
> void* (*get_ipv4_lookup_struct)(int);
> void* (*get_ipv6_lookup_struct)(int);
> @@ -181,6 +185,8 @@ static struct l3fwd_lkp_mode l3fwd_lkp;
>
> static struct l3fwd_lkp_mode l3fwd_em_lkp = {
> .setup = setup_hash,
> + .check_ptype = em_check_ptype,
> + .parse_ptype = em_parse_ptype,
> .main_loop = em_main_loop,
> .get_ipv4_lookup_struct = em_get_ipv4_l3fwd_lookup_struct,
> .get_ipv6_lookup_struct = em_get_ipv6_l3fwd_lookup_struct,
> @@ -188,6 +194,8 @@ static struct l3fwd_lkp_mode l3fwd_em_lkp = {
>
> static struct l3fwd_lkp_mode l3fwd_lpm_lkp = {
> .setup = setup_lpm,
> + .check_ptype = lpm_check_ptype,
> + .parse_ptype = lpm_parse_ptype,
> .main_loop = lpm_main_loop,
> .get_ipv4_lookup_struct = lpm_get_ipv4_l3fwd_lookup_struct,
> .get_ipv6_lookup_struct = lpm_get_ipv6_l3fwd_lookup_struct,
> @@ -209,6 +217,22 @@ setup_l3fwd_lookup_tables(void)
> l3fwd_lkp = l3fwd_lpm_lkp;
> }
>
> +static uint16_t
> +cb_parse_packet_type(uint8_t port __rte_unused,
> + uint16_t queue __rte_unused,
> + struct rte_mbuf *pkts[],
> + uint16_t nb_pkts,
> + uint16_t max_pkts __rte_unused,
> + void *user_param __rte_unused)
> +{
> + unsigned i;
> +
> + for (i = 0; i < nb_pkts; ++i)
> + l3fwd_lkp.parse_ptype(pkts[i]);
No need to create callback chains.
That way you have extra call per packet.
Just 2 callbaclks: cb_lpm_parse_ptype & cb_em_parse_ptype,
and then register one depending on which mode are we in.
Would be simpler and faster, I believe.
Konstantin
> +
> + return nb_pkts;
> +}
> +
> static int
> check_lcore_params(void)
> {
> @@ -456,6 +480,7 @@ parse_eth_dest(const char *optarg)
> #define CMD_LINE_OPT_IPV6 "ipv6"
> #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo"
> #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num"
> +#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
>
> /*
> * This expression is used to calculate the number of mbufs needed
> @@ -486,6 +511,7 @@ parse_args(int argc, char **argv)
> {CMD_LINE_OPT_IPV6, 0, 0, 0},
> {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0},
> {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0},
> + {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
> {NULL, 0, 0, 0}
> };
>
> @@ -612,6 +638,14 @@ parse_args(int argc, char **argv)
> return -1;
> }
> }
> +
> + if (!strncmp(lgopts[option_index].name,
> + CMD_LINE_OPT_PARSE_PTYPE,
> + sizeof(CMD_LINE_OPT_PARSE_PTYPE))) {
> + printf("soft parse-ptype is enabled\n");
> + parse_ptype = 1;
> + }
> +
> break;
>
> default:
> @@ -938,6 +972,22 @@ main(int argc, char **argv)
> rte_exit(EXIT_FAILURE,
> "rte_eth_rx_queue_setup: err=%d, port=%d\n",
> ret, portid);
> +
> + ret = l3fwd_lkp.check_ptype(portid);
> + if (ret)
> + continue;
> + if (!parse_ptype)
> + rte_exit(EXIT_FAILURE,
> + "port %d cannot parse packet type, please add --%s\n",
> + portid, CMD_LINE_OPT_PARSE_PTYPE);
> +
> + if (rte_eth_add_rx_callback(portid, queueid,
> + cb_parse_packet_type,
> + NULL))
> + continue;
> + rte_exit(EXIT_FAILURE,
> + "Failed to add rx callback: port=%d\n",
> + portid);
> }
> }
>
> --
> 2.1.4
More information about the dev
mailing list