[dpdk-dev] [PATCH v4 0/8] Introduce event vectorization
Jerin Jacob
jerinjacobk at gmail.com
Tue Mar 23 19:44:46 CET 2021
On Sat, Mar 20, 2021 at 2:27 AM <pbhagavatula at marvell.com> wrote:
>
> From: Pavan Nikhilesh <pbhagavatula at marvell.com>
>
> In the traditional event programming model, events are identified by a
> flow-id and a uintptr_t. The flow-id uniquely identifies a given event
> and determines the order of scheduling based on schedule type; the
> uintptr_t holds a single object.
>
> Event devices also support burst mode with configurable dequeue depth,
> i.e. each dequeue call would return multiple events and each event
> might be at a different stage of the pipeline.
> Having a burst of events belonging to different stages in a dequeue
> burst is not only difficult to vectorize but also increases the scheduler
> overhead and application overhead of pipelining events further.
> Using event vectors we see a performance gain of ~628% as shown in [1].
>
> By introducing event vectorization, each event will be capable of holding
> multiple uintptr_t of the same flow thereby allowing applications
> to vectorize their pipeline and reduce the complexity of pipelining
> events across multiple stages. This also reduces the complexity of handling
> enqueue and dequeue on an event device.
>
> Since event devices are transparent to the events they schedule, the
> event producers such as eth_rx_adapter, crypto_adapter, etc. are
> responsible for vectorizing the buffers of the same flow into a single
> event.
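
To make the producer-side aggregation concrete, here is a minimal sketch of collecting objects of one flow into a vector and deciding when to flush it, either because it is full or because a timeout expired. This is not the Rx adapter implementation from the series; `demo_vector`, `DEMO_VEC_MAX` and the helper names are hypothetical stand-ins, and timestamps are passed in by the caller rather than read from a clock.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_VEC_MAX 8 /* hypothetical capacity; real vector sizes are configurable */

/* Stand-in for an event vector: a count plus an array of object pointers. */
struct demo_vector {
	uint16_t nb_elem;
	uint64_t first_ns; /* arrival time of the first element */
	void *ptrs[DEMO_VEC_MAX];
};

/*
 * Append obj to vec. The caller must flush and reset a full vector
 * before adding again. Returns 1 when the vector just became full and
 * should be enqueued as a single event, 0 otherwise.
 */
static int demo_vector_add(struct demo_vector *vec, void *obj, uint64_t now_ns)
{
	if (vec->nb_elem == 0)
		vec->first_ns = now_ns;
	vec->ptrs[vec->nb_elem++] = obj;
	return vec->nb_elem == DEMO_VEC_MAX;
}

/* Returns 1 when a partially filled vector has waited at least tmo_ns. */
static int demo_vector_expired(const struct demo_vector *vec,
			       uint64_t now_ns, uint64_t tmo_ns)
{
	return vec->nb_elem > 0 && now_ns - vec->first_ns >= tmo_ns;
}

/* Empty the vector after it has been handed off as an event. */
static void demo_vector_reset(struct demo_vector *vec)
{
	vec->nb_elem = 0;
}
```

A producer loop would call demo_vector_add() per received buffer, enqueue the vector when it returns 1 or when demo_vector_expired() fires, and then call demo_vector_reset().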
>
> The series also breaks ABI in the patch [8/8] which is targeted to the
> v21.11 release.
>
> The dpdk-test-eventdev application has been updated with options to test
> multiple vector sizes and timeouts.
>
> [1]
> As for performance improvement, with an ARM Cortex-A72 equivalent processor,
> software event device (--vdev=event_sw0), single worker core, single stage
> and using one service core for Rx adapter, Tx adapter, and scheduling.
>
> Without event vectorization:
> ./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" --
> --prod_type_ethdev --nb_pkts=0 --verbose 2 --test=pipeline_queue
> --stlist=a --wlcores=20
> Port[0] using Rx adapter[0] configured
> Port[0] using Tx adapter[0] Configured
> 4.728 mpps avg 4.728 mpps
>
> With event vectorization:
> ./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" --
> --prod_type_ethdev --nb_pkts=0 --verbose 2 --test=pipeline_queue
> --stlist=a --wlcores=20 --enable_vector --nb_eth_queues 1
> --vector_size 256
> Port[0] using Rx adapter[0] configured
> Port[0] using Tx adapter[0] Configured
> 34.383 mpps avg 34.383 mpps
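
For reference, the relative gain between the two runs can be worked out from the two reported averages. The helper below is only a sketch of that arithmetic; `gain_percent` is a hypothetical name, not part of the series.

```c
/* Relative gain of `after` over `before`, expressed in percent. */
static double gain_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the figures above, gain_percent(4.728, 34.383) comes out at roughly 627%, i.e. a throughput ratio of about 7.3x, which is in line with the approximate figure quoted in the cover letter.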
>
> Having dedicated service cores for each Rx queue and tuning the vector
> size and dequeue burst size would further improve performance.
>
> API usage is shown below:
>
> Configuration:
>
>     struct rte_event_eth_rx_adapter_event_vector_config vec_conf;
>
>     vector_pool = rte_event_vector_pool_create("vector_pool",
>                     nb_elem, 0, vector_size, socket_id);
>
>     rte_event_eth_rx_adapter_create(id, event_id, &adptr_conf);
>     rte_event_eth_rx_adapter_queue_add(id, eth_id, -1, &queue_conf);
>     if (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR) {
>             vec_conf.vector_sz = vector_size;
>             vec_conf.vector_timeout_ns = vector_tmo_nsec;
>             vec_conf.vector_mp = vector_pool;
>             rte_event_eth_rx_adapter_queue_event_vector_config(id,
>                             eth_id, -1, &vec_conf);
>     }
>
> Fastpath:
>
>     num = rte_event_dequeue_burst(event_id, port_id, &ev, 1, 0);
>     if (!num)
>             continue;
>
>     if (ev.event_type & RTE_EVENT_TYPE_VECTOR) {
>             switch (ev.event_type) {
>             case RTE_EVENT_TYPE_ETHDEV_VECTOR:
>             case RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR: {
>                     struct rte_mbuf **mbufs;
>
>                     mbufs = ev.vector_ev->mbufs;
>                     for (i = 0; i < ev.vector_ev->nb_elem; i++) {
>                             /* Process mbufs[i]. */
>                     }
>                     break;
>             }
>             case ...
>             }
>     }
> ...
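
As a compilable illustration of the dispatch pattern above, the sketch below uses stand-in types instead of the DPDK headers, so it builds without the library. Everything prefixed `demo_`/`DEMO_` is hypothetical: the flag value, the struct layouts and the fixed-size mbuf array are assumptions made for the example and do not reflect the definitions in the series.

```c
#include <stdint.h>

/* Stand-ins for the DPDK types; layouts here are illustrative only. */
struct demo_mbuf {
	uint32_t pkt_len;
};

struct demo_event_vector {
	uint16_t nb_elem;
	struct demo_mbuf *mbufs[16];
};

#define DEMO_TYPE_VECTOR        0x8 /* hypothetical bit marking vector events */
#define DEMO_TYPE_ETHDEV        0x0
#define DEMO_TYPE_ETHDEV_VECTOR (DEMO_TYPE_VECTOR | DEMO_TYPE_ETHDEV)

struct demo_event {
	uint8_t event_type;
	union {
		struct demo_mbuf *mbuf;        /* single-object event */
		struct demo_event_vector *vec; /* vector event */
	} u;
};

/* Returns the total bytes carried by the event, whether scalar or vector. */
static uint64_t demo_process(const struct demo_event *ev)
{
	uint64_t bytes = 0;
	uint16_t i;

	if (ev->event_type & DEMO_TYPE_VECTOR) {
		switch (ev->event_type) {
		case DEMO_TYPE_ETHDEV_VECTOR: {
			struct demo_mbuf **mbufs = ev->u.vec->mbufs;

			/* One event, many objects of the same flow. */
			for (i = 0; i < ev->u.vec->nb_elem; i++)
				bytes += mbufs[i]->pkt_len;
			break;
		}
		default:
			break;
		}
	} else {
		bytes = ev->u.mbuf->pkt_len;
	}
	return bytes;
}
```

For a vector event holding three mbufs of 64, 128 and 256 bytes, demo_process() returns 448; for a scalar event it returns the length of the single mbuf.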
>
> v4 Changes:
> - Fix missing event vector structure in event structure.(Jay)
>
> v3 Changes:
> - Fix unintended formatting changes.
>
> v2 Changes:
> - Multiple grammatical and style fixes. (Jerin)
> - Add parameter to define vector size in power of 2. (Jerin)
> - Redo patch series w/o breaking ABI till the last patch. (David)
> - Add deprecation notice to announce ABI break in 21.11. (David)
> - Add vector limits validation to app/test-eventdev.
>
> Pavan Nikhilesh (8):
> eventdev: introduce event vector capability
> eventdev: introduce event vector Rx capability
> eventdev: introduce event vector Tx capability
> eventdev: add Rx adapter event vector support
> eventdev: add Tx adapter event vector support
> app/eventdev: add event vector mode in pipeline test
> doc: announce event Rx adapter config changes
> eventdev: simplify Rx adapter event vector config
>
> app/test-eventdev/evt_common.h | 4 +
> app/test-eventdev/evt_options.c | 52 +++
> app/test-eventdev/evt_options.h | 4 +
> app/test-eventdev/test_pipeline_atq.c | 310 +++++++++++++++--
> app/test-eventdev/test_pipeline_common.c | 105 +++++-
> app/test-eventdev/test_pipeline_common.h | 18 +
> app/test-eventdev/test_pipeline_queue.c | 320 ++++++++++++++++--
> .../prog_guide/event_ethernet_rx_adapter.rst | 38 +++
> .../prog_guide/event_ethernet_tx_adapter.rst | 12 +
> doc/guides/prog_guide/eventdev.rst | 36 +-
> doc/guides/rel_notes/deprecation.rst | 9 +
> doc/guides/tools/testeventdev.rst | 28 ++
> lib/librte_eventdev/eventdev_pmd.h | 31 +-
> .../rte_event_eth_rx_adapter.c | 305 ++++++++++++++++-
> .../rte_event_eth_rx_adapter.h | 68 ++++
> .../rte_event_eth_tx_adapter.c | 66 +++-
> lib/librte_eventdev/rte_eventdev.c | 11 +-
> lib/librte_eventdev/rte_eventdev.h | 144 +++++++-
> lib/librte_eventdev/version.map | 4 +
> 19 files changed, 1479 insertions(+), 86 deletions(-)
Please update the release notes (doc/guides/rel_notes/release_21_05.rst)
for this feature.
If there are no more comments on this series from others, IMO the next
series is good to merge for RC1.
>
> --
> 2.17.1
>