[PATCH v4] vhost: support CPU copy for small packets
Morten Brørup
mb at smartsharesystems.com
Wed Sep 7 16:47:13 CEST 2022
> From: Wenwu Ma [mailto:wenwux.ma at intel.com]
> Sent: Monday, 29 August 2022 02.57
>
> Offloading small packets to DMA degrades throughput 10%~20%,
> and this is because DMA offloading is not free and DMA is not
> good at processing small packets. In addition, control plane
> packets are usually small, and assign those packets to DMA will
> significantly increase latency, which may cause timeout like
> TCP handshake packets. Therefore, this patch use CPU to perform
> small copies in vhost.
>
> Signed-off-by: Wenwu Ma <wenwux.ma at intel.com>
> ---
[...]
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 35fa4670fd..cf796183a0 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -26,6 +26,8 @@
>
> #define MAX_BATCH_LEN 256
>
> +#define CPU_COPY_THRESHOLD_LEN 256
This threshold may not be optimal for all CPU architectures and/or DMA engines.
Could you please provide a test application to compare the performance of DMA copy with CPU rte_memcpy?
The performance metric should be simple: how many cycles the CPU spends copying packets of various sizes using each of the two methods.
You could provide test_dmadev_perf.c in addition to the existing test_dmadev.c.
You can probably copy some of the concepts and code from test_memcpy_perf.c.
Alternatively, you might be able to add DMA copy to test_memcpy_perf.c.
I'm sorry to push this on you - it should have been done as part of DMAdev development already.
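A minimal sketch of the kind of measurement meant above, written in plain C so it can be compiled without DPDK. Everything in it is an assumption made for illustration: the names copy_fn_t, cpu_copy, avg_copy_ns and report are hypothetical, memcpy() stands in for rte_memcpy(), and clock() (CPU time) stands in for a cycle counter. A real test would add a second copy_fn_t that enqueues the copy on a dmadev and polls for completion, so that the CPU cost of the DMA path is measured the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_PKT_LEN 2048 /* hypothetical upper bound for the test buffers */

/* Hypothetical hook type: one copy method under test. */
typedef void (*copy_fn_t)(void *dst, const void *src, size_t len);

/* CPU path. Stand-in: memcpy(); a DPDK test would call rte_memcpy(). */
static void cpu_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

/*
 * Average CPU time, in nanoseconds, spent on one copy of len bytes.
 * clock() is a coarse stand-in for a cycle counter, so the cost is
 * averaged over iters back-to-back copies.
 */
static double avg_copy_ns(copy_fn_t fn, size_t len, unsigned long iters)
{
	static uint8_t src[MAX_PKT_LEN], dst[MAX_PKT_LEN];
	/* Volatile pointer: keeps the compiler from eliding the calls. */
	copy_fn_t volatile fn_v = fn;
	clock_t start, end;
	unsigned long i;

	assert(len > 0 && len <= MAX_PKT_LEN && iters > 0);
	memset(src, 0xA5, len);
	memset(dst, 0, len);

	start = clock();
	for (i = 0; i < iters; i++)
		fn_v(dst, src, len);
	end = clock();

	/* Sanity check: the method under test really copied the data. */
	assert(memcmp(dst, src, len) == 0);

	return (double)(end - start) * 1e9 / CLOCKS_PER_SEC / (double)iters;
}

/* Print the average cost of one copy method for a few packet sizes. */
static void report(const char *name, copy_fn_t fn)
{
	static const size_t sizes[] = { 64, 128, 256, 512, 1024, 1518 };
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%-6s %4zu bytes: %8.2f ns/copy\n",
		       name, sizes[i], avg_copy_ns(fn, sizes[i], 1000000UL));
}
```

Calling report("cpu", cpu_copy) and then report() again with a DMA-backed copy function would give two columns of numbers; the packet size where they cross is a candidate threshold for that particular CPU and DMA engine.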
-Morten