[PATCH v1 2/3] net/af_packet: RX/TX rte_memcpy, bulk free, prefetch
Morten Brørup
mb at smartsharesystems.com
Wed Jan 28 10:49:54 CET 2026
> > > - Replace memcpy() with rte_memcpy() for optimized copy operations
> > There is no good reason that rte_memcpy() should be faster than memcpy().
> > There were some cases observed with virtio but my hunch is that this is
> > because the two routines are making different alignment assumptions.
>
> ack. I will drop rte_memcpy.
The community is increasingly skeptical about using rte_memcpy() instead of memcpy().
I'm not sure all DPDK documentation has been updated to reflect this change; some of it might still recommend rte_memcpy().
So, simply replacing memcpy() with rte_memcpy() is no longer acceptable.
However, if you back up the replacement with performance data, it is more likely to get accepted.
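A minimal sketch of how such performance data could be collected is shown below. It is an illustration under stated assumptions, not the project's benchmark: candidate_copy() is a hypothetical stand-in so the code builds without DPDK headers (in a real measurement, rte_memcpy() from <rte_memcpy.h> would take its place), DEFINE_COPY_BENCH and run_copy_bench() are names invented here, and the empty inline asm barrier assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the copy routine under test. With DPDK
 * available, use rte_memcpy() in DEFINE_COPY_BENCH below instead. */
static void *candidate_copy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	for (size_t i = 0; i < n; i++)
		d[i] = s[i];
	return dst;
}

/* Generate one timing function per copy routine, so the routine is called
 * directly (and may be inlined) rather than through a function pointer.
 * The empty asm statement is a compiler barrier (GCC/Clang assumed) that
 * keeps the repeated copies from being optimized away.
 * Returns the average CPU time per copy in nanoseconds. */
#define DEFINE_COPY_BENCH(bench_name, COPY)                                   \
static double bench_name(uint8_t *dst, const uint8_t *src,                   \
			 size_t len, size_t iters)                           \
{                                                                             \
	clock_t start = clock();                                              \
	for (size_t i = 0; i < iters; i++) {                                  \
		COPY(dst, src, len);                                          \
		__asm__ volatile("" : : "r"(dst) : "memory");                 \
	}                                                                     \
	clock_t end = clock();                                                \
	return (double)(end - start) * 1e9 / CLOCKS_PER_SEC / (double)iters;  \
}

DEFINE_COPY_BENCH(bench_memcpy, memcpy)
DEFINE_COPY_BENCH(bench_candidate, candidate_copy)

/* Time both routines for one copy size and print the averages.
 * Returns 0 on success, -1 on bad arguments, allocation failure,
 * or if either routine produced a wrong copy. */
static int run_copy_bench(size_t len, size_t iters)
{
	uint8_t *src, *dst;
	double ns_memcpy, ns_candidate;
	int rc = -1;

	if (len == 0 || iters == 0)
		return -1;

	src = malloc(len);
	dst = malloc(len);
	if (src == NULL || dst == NULL)
		goto out;

	for (size_t i = 0; i < len; i++)
		src[i] = (uint8_t)i;

	memset(dst, 0, len);
	ns_memcpy = bench_memcpy(dst, src, len, iters);
	if (memcmp(dst, src, len) != 0)
		goto out;

	memset(dst, 0, len);
	ns_candidate = bench_candidate(dst, src, len, iters);
	if (memcmp(dst, src, len) != 0)
		goto out;

	printf("%zu bytes: memcpy %.1f ns/copy, candidate %.1f ns/copy\n",
	       len, ns_memcpy, ns_candidate);
	rc = 0;
out:
	free(src);
	free(dst);
	return rc;
}
```

Calling run_copy_bench() from main() for a range of frame sizes (for example 64 and 1518 bytes) with a few million iterations gives a first indication. A loop like this runs with hot caches and fixed alignment, so it says little about the real datapath; throughput numbers from the driver itself, for example measured with testpmd, would likely be more convincing.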
> Under what scenarios is rte_memcpy preferred/beneficial?
I wish someone had an answer to that question!
The best I can come up with is:
When using an ancient compiler or C library, where memcpy() isn't properly optimized.
With modern compilers catching up, rte_memcpy() is becoming increasingly obsolete.
Here's some background information about rte_memcpy() from 2017:
https://www.intel.com/content/www/us/en/developer/articles/technical/performance-optimization-of-memcpy-in-dpdk.html
IIRC, the concept of a specialized memcpy() originates from some video streaming or gaming code, where huge memory areas were being copied around.