[dpdk-dev] [PATCH v5 00/29] graph: introduce graph subsystem

Tom Barbette barbette at kth.se
Thu Apr 30 10:07:43 CEST 2020


Hi all,

I have not followed all the discussions about the graph subsystem, so I 
may have missed it, but I could not find the rationale for creating yet 
another graph processing system like VPP, BESS, Click/FastClick and a few 
others, which all support DPDK already and come with up to thousands of 
"nodes" already built.

Is there something fundamentally better here than in those? Or is this 
just to provide a clean in-house API?

Thanks,

Tom

On 11/04/2020 at 16:13, jerinj at marvell.com wrote:
> From: Jerin Jacob <jerinj at marvell.com>
> 
> Using graph traversal for packet processing is a proven architecture
> that has been implemented in various open source libraries.
> 
> Graph architecture for packet processing abstracts the data processing
> functions as “nodes” and “links” them together into a complex “graph”,
> yielding reusable/modular data processing functions.
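
To make the node/link/graph idea above concrete, here is a minimal,
self-contained C sketch. It is not the librte_graph API: every toy_* name
is hypothetical and invented for illustration, and the walker follows a
single edge per burst, which is a simplification of what a real graph
walker would need to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_EDGES 4 /* hypothetical limit on links per node */
#define TOY_BURST 8     /* hypothetical burst size */

struct toy_node;

/* A node's process callback handles one burst and returns the index of
 * the edge the whole burst should follow, or -1 to end the walk. */
typedef int (*toy_process_t)(struct toy_node *self, uint32_t *objs, size_t n);

struct toy_node {
	const char *name;
	toy_process_t process;
	struct toy_node *edges[TOY_MAX_EDGES]; /* "links" to next nodes */
	size_t seen;                           /* objects processed so far */
};

/* Source node: fills the burst with the values 0..n-1. */
static int toy_rx(struct toy_node *self, uint32_t *objs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		objs[i] = (uint32_t)i;
	self->seen += n;
	return 0; /* follow edge 0 */
}

/* Middle node: doubles every object in the burst. */
static int toy_double(struct toy_node *self, uint32_t *objs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		objs[i] *= 2;
	self->seen += n;
	return 0; /* follow edge 0 */
}

/* Sink node: counts the objects and terminates the walk. */
static int toy_sink(struct toy_node *self, uint32_t *objs, size_t n)
{
	(void)objs;
	self->seen += n;
	return -1;
}

/* Walk the graph from a source node; returns the number of nodes visited. */
static size_t toy_graph_walk(struct toy_node *src, uint32_t *objs, size_t n)
{
	size_t visited = 0;
	struct toy_node *node = src;

	assert(src != NULL && objs != NULL);
	while (node != NULL) {
		int edge = node->process(node, objs, n);

		visited++;
		if (edge < 0 || edge >= TOY_MAX_EDGES)
			break;
		node = node->edges[edge];
	}
	return visited;
}
```

Linking rx -> double -> sink and calling toy_graph_walk() on the rx node
visits all three nodes. Reordering or replacing a stage only means
changing the edges, which is the modularity argument in a nutshell.
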
> 
> The patchset also brings performance enhancements and modularity
> to DPDK, as discussed in more detail below.
> 
> v5..v4:
> ------
> Addressed the following review comments from Andrzej Ostruszka.
> 
> 1) Addressed the comment in (http://mails.dpdk.org/archives/dev/2020-April/162184.html),
> improved the following function prototypes/return types and adjusted the
> implementation:
> a) rte_graph_node_get
> b) rte_graph_max_count
> c) rte_graph_export
> d) rte_graph_destroy
> 2) Updated UT and l3fwd-graph for the updated function prototypes
> 3) Bug fix in edge_update()
> 4) Avoid reading graph_src_nodes_count() twice in rte_graph_create()
> 5) Fix graph_mem_fixup_secondray typo
> 6) Fixed Doxygen comments for rte_node_next_stream_put
> 7) Updated the documentation to reflect the same.
> 8) Removed RTE prefix from rte_node_mbuf_priv[1|2] * as they are
> internal defines
> 9) Limited next_hop id provided to LPM route add in
> librte_node/ip4_lookup.c to 24 bits ()
> 10) Fixed pattern array overflow issue in l3fwd-graph/main.c by
> splitting the pattern array into default + non-default arrays. Updated
> doc with the same info.
> 11) Fixed parsing issues in parse_config() in l3fwd-graph/main.c, in
> line with issues reported in l2fwd-event
> 12) Removed next_hop field in l3fwd-graph/main.c main()
> 13) Fixed graph create error check in l3fwd-graph/main.c main()
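
Item 9 in the list above limits the next_hop id to 24 bits. A minimal
sketch of such a bound check follows, assuming the limit exists because
the next-hop field of an LPM table entry is 24 bits wide. The toy_* names
are hypothetical, and the patch itself may mask or otherwise handle
out-of-range values rather than reject them.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed width of the next-hop field; hypothetical names throughout. */
#define TOY_LPM_NEXT_HOP_BITS 24
#define TOY_LPM_NEXT_HOP_MASK ((UINT32_C(1) << TOY_LPM_NEXT_HOP_BITS) - 1)

/* Returns 0 if next_hop fits in 24 bits, -EINVAL otherwise. */
static int toy_next_hop_check(uint32_t next_hop)
{
	if (next_hop & ~TOY_LPM_NEXT_HOP_MASK)
		return -EINVAL;
	return 0;
}
```

Rejecting the value up front, instead of silently truncating it, keeps a
route from pointing at an unintended next hop.
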
> 
> v4..v3:
> -------
> Addressed the following review comments from Wang, Xiao W
> 
> 1) Remove unnecessary line from rte_graph.h
> 2) Fix a typo from rte_graph.h
> 3) Move NODE_ID_CHECK to 3rd patch where it is first used.
> 4) Fixed bug in edge_update()
> 
> v3..v2:
> -------
> 1) Refactor ipv4 node lookup by moving SSE and NEON specific code to
> lib/librte_node/ip4_lookup_sse.h and lib/librte_node/ip4_lookup_neon.h
> 2) Add scalar version of process() function for ipv4 lookup to make
> the node work on machines other than x86 and arm64.
> 
> v2..v1:
> ------
> 1) Added programmer guide/implementation documentation and l3fwd-graph doc
> 
> RFC..v1:
> --------
> 
> 1) Split the patch to more logical ones for review.
> 2) Added doxygen comments for the API
> 3) Code cleanup
> 4) Additional performance improvements.
> Delta between l3fwd and l3fwd-graph is now negligible (~1%) on
> octeontx2.
> 5) Added SIMD routines for x86 in addition to arm64.
> 
> Hosted in netlify for easy reference:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Programmer’s Guide:
> https://dpdk-graph.netlify.com/doc/html/guides/prog_guide/graph_lib.html
> 
> l3fwd-graph doc:
> https://dpdk-graph.netlify.com/doc/html/guides/sample_app_ug/l3_forward_graph.html
> 
> API doc:
> https://dpdk-graph.netlify.com/doc/html/api/rte__graph_8h.html
> https://dpdk-graph.netlify.com/doc/html/api/rte__graph__worker_8h.html
> https://dpdk-graph.netlify.com/doc/html/api/rte__node__eth__api_8h.html
> https://dpdk-graph.netlify.com/doc/html/api/rte__node__ip4__api_8h.html
> 
> 2) Added the release notes for this feature
> 
> 3) Fix build issues reported by CI for v1:
> http://mails.dpdk.org/archives/test-report/2020-March/121326.html
> 
> 
> Additional nodes planned for v20.08
> ----------------------------------
> 1) Packet classification node
> 2) Support for IPV6 LPM node
> 
> 
> This patchset contains
> -----------------------------
> 1) The API definition to "create" nodes and "link" them together to
> create a "graph" for packet processing. See lib/librte_graph/rte_graph.h
> 
> 2) The fast path API definition for the graph walker and enqueue
> function used by the workers. See lib/librte_graph/rte_graph_worker.h
> 
> 3) Optimized SW implementation for (1) and (2). See, lib/librte_graph/
> 
> 4) Test case to verify the graph infrastructure functionality
> See, app/test/test_graph.c
>   
> 5) Performance test cases to evaluate the cost of graph walker and nodes
> enqueue fast-path function for various combinations.
> 
> See app/test/test_graph_perf.c
> 
> 6) Packet processing nodes (Null, Rx, Tx, Pkt drop, IPv4 rewrite, IPv4
> lookup) using graph infrastructure. See lib/librte_node/*
> 
> 7) An example application to showcase l3fwd (functionality same as
> existing examples/l3fwd) using graph infrastructure and the packet
> processing nodes (item (6)). See examples/l3fwd-graph/.
> 
> Performance
> -----------
> 1) Graph walk and node enqueue overhead can be tested with the
> performance test case application [1]
> # If all packets go from one node to another node (we call this a
> # "homerun"), it is just a pointer swap for a burst of packets.
> # In the worst case, it takes a handful of cycles to move an object
> # from one node to another.
> 
> 2) Performance comparison of existing l3fwd (the complete static code
> without any nodes) vs modular l3fwd-graph with 5 nodes
> (ip4_lookup, ip4_rewrite, ethdev_tx, ethdev_rx, pkt_drop).
> Here is a graphical representation of the l3fwd-graph as a Graphviz dot
> file:
> http://bit.ly/39UPPGm
> 
> # l3fwd-graph performance is -1.2% wrt static l3fwd.
> 
> # We have simulated a similar test with the existing librte_pipeline
> # application [4].
> ip_pipeline application is -48.62% wrt static l3fwd.
> 
> The above results are on octeontx2 and may vary on other platforms.
> Platforms with larger L1 and L2 caches will perform even better.
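
The "homerun" case from item 1 above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: the
toy_stream type and toy_stream_move() are hypothetical names, not part of
rte_graph_worker.h, and the sketch assumes the swap is only possible when
the destination stream is empty.

```c
#include <stddef.h>
#include <string.h>

#define TOY_STREAM_CAP 256 /* hypothetical per-node stream capacity */

/* Pending objects (e.g. packet pointers) queued for one node. */
struct toy_stream {
	void **objs;  /* array of TOY_STREAM_CAP pointers */
	size_t count; /* number of valid entries in objs */
};

/*
 * Move every pending object from src to dst.
 * Returns 1 for a "homerun" (buffers swapped, nothing copied),
 * 0 if the objects had to be copied, -1 if dst would overflow.
 */
static int toy_stream_move(struct toy_stream *src, struct toy_stream *dst)
{
	if (dst->count == 0) {
		/* Homerun: exchange the two buffers instead of copying. */
		void **tmp = dst->objs;

		dst->objs = src->objs;
		dst->count = src->count;
		src->objs = tmp;
		src->count = 0;
		return 1;
	}
	if (dst->count + src->count > TOY_STREAM_CAP)
		return -1;
	/* Slow path: append the objects behind what dst already holds. */
	memcpy(&dst->objs[dst->count], src->objs,
	       src->count * sizeof(void *));
	dst->count += src->count;
	src->count = 0;
	return 0;
}
```

In the swap path the cost does not depend on the burst size, which is why
a whole burst can move between nodes for roughly the price of a few
pointer assignments.
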
> 
> 
> Tested architectures:
> --------------------
> 1) AArch64
> 2) X86
> 
> 
> Identified tweaks for better performance on different targets
> -------------------------------------------------------------
> 1) Test with various burst size values (256, 128, 64, 32) using the
> CONFIG_RTE_GRAPH_BURST_SIZE config option.
> Based on our testing, the sweet spot on x86 and arm64 servers is a
> burst size of 256, while on arm64 embedded SoCs it is either 64 or 128.
> 
> 2) Disable node statistics (use the CONFIG_RTE_LIBRTE_GRAPH_STATS config
> option) if not needed.
> 
> 3) Use arm64 optimized memory copy for arm64 architecture by
> selecting CONFIG_RTE_ARCH_ARM64_MEMCPY.
> 
> Commands to run tests
> ---------------------
> 
> [1]
> perf test:
> echo "graph_perf_autotest" | sudo ./build/app/test/dpdk-test -c 0x30
> 
> [2]
> functionality test:
> echo "graph_autotest" | sudo ./build/app/test/dpdk-test -c 0x30
> 
> [3]
> l3fwd-graph:
> ./l3fwd-graph -c 0x100  -- -p 0x3 --config="(0, 0, 8)" -P
> 
> [4]
> # ./ip_pipeline -c 0xff0000 -- -s route.cli
> 
> Route.cli: (Copy paste to the shell to avoid dos format issues)
> 
> https://pastebin.com/raw/B4Ktx7TT
> 
> Jerin Jacob (13):
>    graph: define the public API for graph support
>    graph: implement node registration
>    graph: implement node operations
>    graph: implement node debug routines
>    graph: implement internal graph operation helpers
>    graph: populate fastpath memory for graph reel
>    graph: implement create and destroy APIs
>    graph: implement graph operation APIs
>    graph: implement Graphviz export
>    graph: implement debug routines
>    graph: implement stats support
>    graph: implement fastpath API routines
>    doc: add graph library programmer's guide guide
> 
> Kiran Kumar K (2):
>    graph: add unit test case
>    node: add ipv4 rewrite node
> 
> Nithin Dabilpuram (11):
>    node: add log infra and null node
>    node: add ethdev Rx node
>    node: add ethdev Tx node
>    node: add ethdev Rx and Tx node ctrl API
>    node: ipv4 lookup for arm64
>    node: add ipv4 rewrite and lookup ctrl API
>    node: add packet drop node
>    l3fwd-graph: add graph based l3fwd skeleton
>    l3fwd-graph: add ethdev configuration changes
>    l3fwd-graph: add graph config and main loop
>    doc: add l3fwd graph application user guide
> 
> Pavan Nikhilesh (3):
>    graph: add performance testcase
>    node: add generic ipv4 lookup node
>    node: ipv4 lookup for x86
> 
>   MAINTAINERS                                   |   14 +
>   app/test/Makefile                             |    7 +
>   app/test/meson.build                          |   12 +-
>   app/test/test_graph.c                         |  819 ++++
>   app/test/test_graph_perf.c                    | 1057 ++++++
>   config/common_base                            |   12 +
>   config/rte_config.h                           |    4 +
>   doc/api/doxy-api-index.md                     |    5 +
>   doc/api/doxy-api.conf.in                      |    2 +
>   doc/guides/prog_guide/graph_lib.rst           |  397 ++
>   .../prog_guide/img/anatomy_of_a_node.svg      | 1078 ++++++
>   .../prog_guide/img/graph_mem_layout.svg       |  702 ++++
>   doc/guides/prog_guide/img/link_the_nodes.svg  | 3330 +++++++++++++++++
>   doc/guides/prog_guide/index.rst               |    1 +
>   doc/guides/rel_notes/release_20_05.rst        |   32 +
>   doc/guides/sample_app_ug/index.rst            |    1 +
>   doc/guides/sample_app_ug/intro.rst            |    4 +
>   doc/guides/sample_app_ug/l3_forward_graph.rst |  334 ++
>   examples/Makefile                             |    3 +
>   examples/l3fwd-graph/Makefile                 |   58 +
>   examples/l3fwd-graph/main.c                   | 1126 ++++++
>   examples/l3fwd-graph/meson.build              |   13 +
>   examples/meson.build                          |    6 +-
>   lib/Makefile                                  |    6 +
>   lib/librte_graph/Makefile                     |   28 +
>   lib/librte_graph/graph.c                      |  587 +++
>   lib/librte_graph/graph_debug.c                |   84 +
>   lib/librte_graph/graph_ops.c                  |  169 +
>   lib/librte_graph/graph_populate.c             |  234 ++
>   lib/librte_graph/graph_private.h              |  347 ++
>   lib/librte_graph/graph_stats.c                |  406 ++
>   lib/librte_graph/meson.build                  |   11 +
>   lib/librte_graph/node.c                       |  421 +++
>   lib/librte_graph/rte_graph.h                  |  668 ++++
>   lib/librte_graph/rte_graph_version.map        |   47 +
>   lib/librte_graph/rte_graph_worker.h           |  510 +++
>   lib/librte_node/Makefile                      |   32 +
>   lib/librte_node/ethdev_ctrl.c                 |  115 +
>   lib/librte_node/ethdev_rx.c                   |  221 ++
>   lib/librte_node/ethdev_rx_priv.h              |   81 +
>   lib/librte_node/ethdev_tx.c                   |   86 +
>   lib/librte_node/ethdev_tx_priv.h              |   62 +
>   lib/librte_node/ip4_lookup.c                  |  215 ++
>   lib/librte_node/ip4_lookup_neon.h             |  238 ++
>   lib/librte_node/ip4_lookup_sse.h              |  244 ++
>   lib/librte_node/ip4_rewrite.c                 |  326 ++
>   lib/librte_node/ip4_rewrite_priv.h            |   77 +
>   lib/librte_node/log.c                         |   14 +
>   lib/librte_node/meson.build                   |   10 +
>   lib/librte_node/node_private.h                |   79 +
>   lib/librte_node/null.c                        |   23 +
>   lib/librte_node/pkt_drop.c                    |   26 +
>   lib/librte_node/rte_node_eth_api.h            |   64 +
>   lib/librte_node/rte_node_ip4_api.h            |   78 +
>   lib/librte_node/rte_node_version.map          |    9 +
>   lib/meson.build                               |    5 +-
>   meson.build                                   |    1 +
>   mk/rte.app.mk                                 |    2 +
>   58 files changed, 14538 insertions(+), 5 deletions(-)
>   create mode 100644 app/test/test_graph.c
>   create mode 100644 app/test/test_graph_perf.c
>   create mode 100644 doc/guides/prog_guide/graph_lib.rst
>   create mode 100644 doc/guides/prog_guide/img/anatomy_of_a_node.svg
>   create mode 100644 doc/guides/prog_guide/img/graph_mem_layout.svg
>   create mode 100644 doc/guides/prog_guide/img/link_the_nodes.svg
>   create mode 100644 doc/guides/sample_app_ug/l3_forward_graph.rst
>   create mode 100644 examples/l3fwd-graph/Makefile
>   create mode 100644 examples/l3fwd-graph/main.c
>   create mode 100644 examples/l3fwd-graph/meson.build
>   create mode 100644 lib/librte_graph/Makefile
>   create mode 100644 lib/librte_graph/graph.c
>   create mode 100644 lib/librte_graph/graph_debug.c
>   create mode 100644 lib/librte_graph/graph_ops.c
>   create mode 100644 lib/librte_graph/graph_populate.c
>   create mode 100644 lib/librte_graph/graph_private.h
>   create mode 100644 lib/librte_graph/graph_stats.c
>   create mode 100644 lib/librte_graph/meson.build
>   create mode 100644 lib/librte_graph/node.c
>   create mode 100644 lib/librte_graph/rte_graph.h
>   create mode 100644 lib/librte_graph/rte_graph_version.map
>   create mode 100644 lib/librte_graph/rte_graph_worker.h
>   create mode 100644 lib/librte_node/Makefile
>   create mode 100644 lib/librte_node/ethdev_ctrl.c
>   create mode 100644 lib/librte_node/ethdev_rx.c
>   create mode 100644 lib/librte_node/ethdev_rx_priv.h
>   create mode 100644 lib/librte_node/ethdev_tx.c
>   create mode 100644 lib/librte_node/ethdev_tx_priv.h
>   create mode 100644 lib/librte_node/ip4_lookup.c
>   create mode 100644 lib/librte_node/ip4_lookup_neon.h
>   create mode 100644 lib/librte_node/ip4_lookup_sse.h
>   create mode 100644 lib/librte_node/ip4_rewrite.c
>   create mode 100644 lib/librte_node/ip4_rewrite_priv.h
>   create mode 100644 lib/librte_node/log.c
>   create mode 100644 lib/librte_node/meson.build
>   create mode 100644 lib/librte_node/node_private.h
>   create mode 100644 lib/librte_node/null.c
>   create mode 100644 lib/librte_node/pkt_drop.c
>   create mode 100644 lib/librte_node/rte_node_eth_api.h
>   create mode 100644 lib/librte_node/rte_node_ip4_api.h
>   create mode 100644 lib/librte_node/rte_node_version.map
> 

