[dpdk-dev] [PATCH v5 00/29] graph: introduce graph subsystem
Tom Barbette
barbette at kth.se
Thu Apr 30 10:07:43 CEST 2020
Hi all,
I could not follow all the discussions regarding the graph subsystem, but I
could not find the rationale behind re-creating yet another graph
processing system like VPP, BESS, Click/FastClick and a few others that
all support DPDK already and come with up to thousands of "nodes"
already built.
Is there something fundamentally better here than in those? Or is this just to
provide a clean in-house API?
Thanks,
Tom
On 11/04/2020 at 16:13, jerinj at marvell.com wrote:
> From: Jerin Jacob <jerinj at marvell.com>
>
> Using graph traversal for packet processing is a proven architecture
> that has been implemented in various open source libraries.
>
> Graph architecture for packet processing enables abstracting the data
> processing functions as “nodes” and “linking” them together to create a
> complex “graph” of reusable/modular data processing functions.
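To make the node/link idea concrete for readers of the thread, here is a minimal sketch in plain C. It is not the librte_graph API: every name in it (struct node, process_fn, graph_walk, classify, sink) is hypothetical, and it simplifies by sending a whole burst down a single edge, whereas a real walker can presumably steer objects individually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDGES_MAX 2 /* hypothetical limit, for the sketch only */

struct node;

/* A node's process callback handles a burst of objects and returns the
 * index of the edge the burst should follow, or -1 to end the walk. */
typedef int (*process_fn)(struct node *self, uint32_t *objs, size_t n);

struct node {
	const char *name;
	process_fn process;
	struct node *next[EDGES_MAX]; /* "links" to downstream nodes */
	size_t seen;                  /* objects processed so far */
};

/* Toy classifier: an odd first value selects edge 1, otherwise edge 0. */
static int classify(struct node *self, uint32_t *objs, size_t n)
{
	self->seen += n;
	return (n > 0 && (objs[0] & 1u)) ? 1 : 0;
}

/* Terminal node: consumes the burst and reports no further edge. */
static int sink(struct node *self, uint32_t *objs, size_t n)
{
	(void)objs;
	self->seen += n;
	return -1;
}

/* Walk from the source node until a node reports no next edge.
 * Returns the number of nodes visited. */
static size_t graph_walk(struct node *src, uint32_t *objs, size_t n)
{
	size_t hops = 0;
	struct node *cur = src;

	while (cur != NULL) {
		int edge = cur->process(cur, objs, n);

		hops++;
		cur = (edge < 0 || edge >= EDGES_MAX) ? NULL : cur->next[edge];
	}
	return hops;
}
```

Linking a classify node to two sink nodes and calling graph_walk() on a burst then visits two nodes: the classifier and whichever sink it selected. New processing steps are added by defining another callback and wiring it into next[], which is the modularity argument made above.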
>
> The patchset further includes performance enhancements and modularity
> to the DPDK as discussed in more detail below.
>
> v5..v4:
> ------
> Addressed the following review comments from Andrzej Ostruszka.
>
> 1) Addressed the comment in (http://mails.dpdk.org/archives/dev/2020-April/162184.html)
> and improved the following function prototypes/return types and adjusted the
> implementation
> a) rte_graph_node_get
> b) rte_graph_max_count
> c) rte_graph_export
> d) rte_graph_destroy
> 2) Updated UT and l3fwd-graph for updated function prototype
> 3) bug fix in edge_update
> 4) avoid reading graph_src_nodes_count() twice in rte_graph_create()
> 5) Fix graph_mem_fixup_secondray typo
> 6) Fixed Doxygen comments for rte_node_next_stream_put
> 7) Updated the documentation to reflect the same.
> 8) Removed RTE prefix from rte_node_mbuf_priv[1|2] * as they are
> internal defines
> 9) Limited next_hop id provided to LPM route add in
> librte_node/ip4_lookup.c to 24 bits ()
> 10) Fixed pattern array overflow issue with l3fwd-graph/main.c by
> splitting the pattern array into default + non-default arrays. Updated doc
> with the same info.
> 11) Fixed parsing issues in parse_config() in l3fwd-graph/main.c in line
> with issues reported in l2fwd-event
> 12) Removed next_hop field in l3fwd-graph/main.c main()
> 13) Fixed graph create error check in l3fwd-graph/main.c main()
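On item 9 in the list above, a small sketch of what limiting a next_hop id to 24 bits can look like. This is an illustration only, not the code in librte_node/ip4_lookup.c: the helper names are hypothetical, and the reason for the limit (presumably the width of the next-hop field in the LPM table entry) is an assumption on my part.

```c
#include <assert.h>
#include <stdint.h>

#define NEXT_HOP_BITS 24u /* width stated in the changelog */
#define NEXT_HOP_MAX  ((UINT32_C(1) << NEXT_HOP_BITS) - 1u)

/* Hypothetical helper: return 0 if the id fits in 24 bits, -1 if not. */
static int next_hop_check(uint32_t next_hop)
{
	return next_hop <= NEXT_HOP_MAX ? 0 : -1;
}

/* Hypothetical alternative: silently keep only the low 24 bits. */
static uint32_t next_hop_truncate(uint32_t next_hop)
{
	return next_hop & NEXT_HOP_MAX;
}
```

Rejecting an out-of-range id makes the error visible to the caller, while masking keeps the route add going at the risk of aliasing two different ids. Which of the two the patch actually does is not stated in the changelog.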
>
> v4..v3:
> -------
> Addressed the following review comments from Wang, Xiao W
>
> 1) Remove unnecessary line from rte_graph.h
> 2) Fix a typo from rte_graph.h
> 3) Move NODE_ID_CHECK to 3rd patch where it is first used.
> 4) Fixed bug in edge_update()
>
> v3..v2:
> -------
> 1) refactor ipv4 node lookup by moving SSE and NEON specific code to
> lib/librte_node/ip4_lookup_sse.h and lib/librte_node/ip4_lookup_neon.h
> 2) Add scalar version of process() function for ipv4 lookup to make
> the node work on machines other than x86 and arm64.
>
> v2..v1:
> ------
> 1) Added programmer guide/implementation documentation and l3fwd-graph doc
>
> RFC..v1:
> --------
>
> 1) Split the patch to more logical ones for review.
> 2) Added doxygen comments for the API
> 3) Code cleanup
> 4) Additional performance improvements.
> Delta between l3fwd and l3fwd-graph is negligible now
> (~1%) on octeontx2.
> 5) Added SIMD routines for x86 in addition to arm64.
>
> Hosted in netlify for easy reference:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Programmer’s Guide:
> https://dpdk-graph.netlify.com/doc/html/guides/prog_guide/graph_lib.html
>
> l3fwd-graph doc:
> https://dpdk-graph.netlify.com/doc/html/guides/sample_app_ug/l3_forward_graph.html
>
> API doc:
> https://dpdk-graph.netlify.com/doc/html/api/rte__graph_8h.html
> https://dpdk-graph.netlify.com/doc/html/api/rte__graph__worker_8h.html
> https://dpdk-graph.netlify.com/doc/html/api/rte__node__eth__api_8h.html
> https://dpdk-graph.netlify.com/doc/html/api/rte__node__ip4__api_8h.html
>
> 2) Added the release notes for this feature
>
> 3) Fix build issues reported by CI for v1:
> http://mails.dpdk.org/archives/test-report/2020-March/121326.html
>
>
> Additional nodes planned for v20.08
> ----------------------------------
> 1) Packet classification node
> 2) Support for IPV6 LPM node
>
>
> This patchset contains
> -----------------------------
> 1) The API definition to "create" nodes and "link" them together to create a
> "graph" for packet processing. See, lib/librte_graph/rte_graph.h
>
> 2) The Fast path API definition for the graph walker and enqueue
> function used by the workers. See, lib/librte_graph/rte_graph_worker.h
>
> 3) Optimized SW implementation for (1) and (2). See, lib/librte_graph/
>
> 4) Test case to verify the graph infrastructure functionality
> See, app/test/test_graph.c
>
> 5) Performance test cases to evaluate the cost of graph walker and nodes
> enqueue fast-path function for various combinations.
>
> See app/test/test_graph_perf.c
>
> 6) Packet processing nodes (Null, Rx, Tx, Pkt drop, IPv4 rewrite, IPv4
> lookup) using graph infrastructure. See lib/librte_node/*
>
> 7) An example application to showcase l3fwd
> (functionality same as existing examples/l3fwd) using graph
> infrastructure and the packet processing nodes (item (6)). See examples/l3fwd-graph/.
>
> Performance
> -----------
> 1) Graph walk and node enqueue overhead can be tested with performance
> test case application [1]
> # If all packets go from one node to another node (we call it a
> "homerun"), then it is just a pointer swap for a burst of packets.
> # In the worst case, it takes a handful of cycles to move an object from
> one node to another.
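The "homerun" case described above can be sketched as follows. This is a simplified model in plain C, not the code from rte_graph_worker.h: struct stream and both helpers are hypothetical names, and the sketch assumes the destination stream is empty when the swap happens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node stream: an array of object pointers and a count. */
struct stream {
	void **objs;
	size_t count;
};

/* Fast path ("homerun"): the whole burst goes to one next node whose
 * stream is empty, so the two nodes just exchange buffers. The cost is
 * constant no matter how many objects are in the burst. */
static void stream_move_all(struct stream *src, struct stream *dst)
{
	void **tmp = dst->objs;

	dst->objs = src->objs;
	dst->count = src->count;
	src->objs = tmp;
	src->count = 0;
}

/* Slow path: objects diverge to different next nodes, so each one is
 * appended individually. Returns 0 on success, -1 if dst is full. */
static int stream_enqueue_one(struct stream *dst, size_t capacity, void *obj)
{
	if (dst->count >= capacity)
		return -1;
	dst->objs[dst->count++] = obj;
	return 0;
}
```

The swap touches two pointers and two counters regardless of burst size, while the per-object path costs one store and one increment per object. That matches the "pointer swap" versus "handful of cycles per object" distinction in the cover letter.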
>
> 2) Performance comparison with existing l3fwd (The complete static code
> without any nodes) vs modular l3fwd-graph with 5 nodes
> (ip4_lookup, ip4_rewrite, ethdev_tx, ethdev_rx, pkt_drop).
> Here is a graphical representation of the l3fwd-graph as a Graphviz dot
> file:
> http://bit.ly/39UPPGm
>
> # l3fwd-graph performance is -1.2% wrt static l3fwd.
>
> # We have run a similar test with the existing librte_pipeline
> application [4].
> The ip_pipeline application is -48.62% wrt static l3fwd.
>
> The above results are on octeontx2. They may vary on other platforms.
> Platforms with larger L1 and L2 caches are expected to perform
> better.
>
>
> Tested architectures:
> --------------------
> 1) AArch64
> 2) X86
>
>
> Identified tweaks for better performance on different targets
> ---------------------------------------------------------------
> 1) Test with various burst size values (256, 128, 64, 32) using
> CONFIG_RTE_GRAPH_BURST_SIZE config option.
> Based on our testing, on x86 and arm64 servers, the sweet spot is a burst
> size of 256, while on arm64 embedded SoCs it is either 64 or 128.
>
> 2) Disable node statistics (use CONFIG_RTE_LIBRTE_GRAPH_STATS config
> option) if not needed.
>
> 3) Use arm64 optimized memory copy for arm64 architecture by
> selecting CONFIG_RTE_ARCH_ARM64_MEMCPY.
>
> Commands to run tests
> ---------------------
>
> [1]
> perf test:
> echo "graph_perf_autotest" | sudo ./build/app/test/dpdk-test -c 0x30
>
> [2]
> functionality test:
> echo "graph_autotest" | sudo ./build/app/test/dpdk-test -c 0x30
>
> [3]
> l3fwd-graph:
> ./l3fwd-graph -c 0x100 -- -p 0x3 --config="(0, 0, 8)" -P
>
> [4]
> # ./ip_pipeline -c 0xff0000 -- -s route.cli
>
> Route.cli: (Copy paste to the shell to avoid dos format issues)
>
> https://pastebin.com/raw/B4Ktx7TT
>
> Jerin Jacob (13):
> graph: define the public API for graph support
> graph: implement node registration
> graph: implement node operations
> graph: implement node debug routines
> graph: implement internal graph operation helpers
> graph: populate fastpath memory for graph reel
> graph: implement create and destroy APIs
> graph: implement graph operation APIs
> graph: implement Graphviz export
> graph: implement debug routines
> graph: implement stats support
> graph: implement fastpath API routines
> doc: add graph library programmer's guide guide
>
> Kiran Kumar K (2):
> graph: add unit test case
> node: add ipv4 rewrite node
>
> Nithin Dabilpuram (11):
> node: add log infra and null node
> node: add ethdev Rx node
> node: add ethdev Tx node
> node: add ethdev Rx and Tx node ctrl API
> node: ipv4 lookup for arm64
> node: add ipv4 rewrite and lookup ctrl API
> node: add packet drop node
> l3fwd-graph: add graph based l3fwd skeleton
> l3fwd-graph: add ethdev configuration changes
> l3fwd-graph: add graph config and main loop
> doc: add l3fwd graph application user guide
>
> Pavan Nikhilesh (3):
> graph: add performance testcase
> node: add generic ipv4 lookup node
> node: ipv4 lookup for x86
>
> MAINTAINERS | 14 +
> app/test/Makefile | 7 +
> app/test/meson.build | 12 +-
> app/test/test_graph.c | 819 ++++
> app/test/test_graph_perf.c | 1057 ++++++
> config/common_base | 12 +
> config/rte_config.h | 4 +
> doc/api/doxy-api-index.md | 5 +
> doc/api/doxy-api.conf.in | 2 +
> doc/guides/prog_guide/graph_lib.rst | 397 ++
> .../prog_guide/img/anatomy_of_a_node.svg | 1078 ++++++
> .../prog_guide/img/graph_mem_layout.svg | 702 ++++
> doc/guides/prog_guide/img/link_the_nodes.svg | 3330 +++++++++++++++++
> doc/guides/prog_guide/index.rst | 1 +
> doc/guides/rel_notes/release_20_05.rst | 32 +
> doc/guides/sample_app_ug/index.rst | 1 +
> doc/guides/sample_app_ug/intro.rst | 4 +
> doc/guides/sample_app_ug/l3_forward_graph.rst | 334 ++
> examples/Makefile | 3 +
> examples/l3fwd-graph/Makefile | 58 +
> examples/l3fwd-graph/main.c | 1126 ++++++
> examples/l3fwd-graph/meson.build | 13 +
> examples/meson.build | 6 +-
> lib/Makefile | 6 +
> lib/librte_graph/Makefile | 28 +
> lib/librte_graph/graph.c | 587 +++
> lib/librte_graph/graph_debug.c | 84 +
> lib/librte_graph/graph_ops.c | 169 +
> lib/librte_graph/graph_populate.c | 234 ++
> lib/librte_graph/graph_private.h | 347 ++
> lib/librte_graph/graph_stats.c | 406 ++
> lib/librte_graph/meson.build | 11 +
> lib/librte_graph/node.c | 421 +++
> lib/librte_graph/rte_graph.h | 668 ++++
> lib/librte_graph/rte_graph_version.map | 47 +
> lib/librte_graph/rte_graph_worker.h | 510 +++
> lib/librte_node/Makefile | 32 +
> lib/librte_node/ethdev_ctrl.c | 115 +
> lib/librte_node/ethdev_rx.c | 221 ++
> lib/librte_node/ethdev_rx_priv.h | 81 +
> lib/librte_node/ethdev_tx.c | 86 +
> lib/librte_node/ethdev_tx_priv.h | 62 +
> lib/librte_node/ip4_lookup.c | 215 ++
> lib/librte_node/ip4_lookup_neon.h | 238 ++
> lib/librte_node/ip4_lookup_sse.h | 244 ++
> lib/librte_node/ip4_rewrite.c | 326 ++
> lib/librte_node/ip4_rewrite_priv.h | 77 +
> lib/librte_node/log.c | 14 +
> lib/librte_node/meson.build | 10 +
> lib/librte_node/node_private.h | 79 +
> lib/librte_node/null.c | 23 +
> lib/librte_node/pkt_drop.c | 26 +
> lib/librte_node/rte_node_eth_api.h | 64 +
> lib/librte_node/rte_node_ip4_api.h | 78 +
> lib/librte_node/rte_node_version.map | 9 +
> lib/meson.build | 5 +-
> meson.build | 1 +
> mk/rte.app.mk | 2 +
> 58 files changed, 14538 insertions(+), 5 deletions(-)
> create mode 100644 app/test/test_graph.c
> create mode 100644 app/test/test_graph_perf.c
> create mode 100644 doc/guides/prog_guide/graph_lib.rst
> create mode 100644 doc/guides/prog_guide/img/anatomy_of_a_node.svg
> create mode 100644 doc/guides/prog_guide/img/graph_mem_layout.svg
> create mode 100644 doc/guides/prog_guide/img/link_the_nodes.svg
> create mode 100644 doc/guides/sample_app_ug/l3_forward_graph.rst
> create mode 100644 examples/l3fwd-graph/Makefile
> create mode 100644 examples/l3fwd-graph/main.c
> create mode 100644 examples/l3fwd-graph/meson.build
> create mode 100644 lib/librte_graph/Makefile
> create mode 100644 lib/librte_graph/graph.c
> create mode 100644 lib/librte_graph/graph_debug.c
> create mode 100644 lib/librte_graph/graph_ops.c
> create mode 100644 lib/librte_graph/graph_populate.c
> create mode 100644 lib/librte_graph/graph_private.h
> create mode 100644 lib/librte_graph/graph_stats.c
> create mode 100644 lib/librte_graph/meson.build
> create mode 100644 lib/librte_graph/node.c
> create mode 100644 lib/librte_graph/rte_graph.h
> create mode 100644 lib/librte_graph/rte_graph_version.map
> create mode 100644 lib/librte_graph/rte_graph_worker.h
> create mode 100644 lib/librte_node/Makefile
> create mode 100644 lib/librte_node/ethdev_ctrl.c
> create mode 100644 lib/librte_node/ethdev_rx.c
> create mode 100644 lib/librte_node/ethdev_rx_priv.h
> create mode 100644 lib/librte_node/ethdev_tx.c
> create mode 100644 lib/librte_node/ethdev_tx_priv.h
> create mode 100644 lib/librte_node/ip4_lookup.c
> create mode 100644 lib/librte_node/ip4_lookup_neon.h
> create mode 100644 lib/librte_node/ip4_lookup_sse.h
> create mode 100644 lib/librte_node/ip4_rewrite.c
> create mode 100644 lib/librte_node/ip4_rewrite_priv.h
> create mode 100644 lib/librte_node/log.c
> create mode 100644 lib/librte_node/meson.build
> create mode 100644 lib/librte_node/node_private.h
> create mode 100644 lib/librte_node/null.c
> create mode 100644 lib/librte_node/pkt_drop.c
> create mode 100644 lib/librte_node/rte_node_eth_api.h
> create mode 100644 lib/librte_node/rte_node_ip4_api.h
> create mode 100644 lib/librte_node/rte_node_version.map
>