[dpdk-dev] rte_memcpy - fence and stream

Morten Brørup mb at smartsharesystems.com
Thu May 27 20:15:19 CEST 2021


> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Thursday, 27 May 2021 19.22
> 
> On Thu, May 27, 2021 at 10:39:59PM +0530, Manish Sharma wrote:
> >    For the case I have, hardly 2% of the data buffers which are being
> >    copied get looked at - mostly it's for DMA. Having a version of
> >    DPDK memcopy that does non-temporal copies would definitely be good.
> >    If in my case I have a lot of CPUs doing the copy in parallel,
> >    would the I/OAT driver copy accelerator still help?
> >
> It will depend upon the size of the copies being done. For bigger
> packets, the accelerator can help free up CPU cycles for other things.
> 
> However, if only 2% of the data which is being copied gets looked at,
> why does it need to be copied? Can the original buffers not be used in
> that case?

I can only speak for myself here...

Our firmware has a packet capture feature with a filter.

If a packet matches the capture filter, a metadata header and the relevant part of the packet contents ("snap length" in tcpdump terminology) are appended to a large memory area (the "capture buffer") using rte_pktmbuf_read/rte_memcpy. This capture buffer is only read through the GUI or management API by the network administrator, i.e. it will only be read minutes or hours later, so there is no need to put any of it in any CPU cache.

It does not make sense to clone and hold on to many thousands of mbufs when we only need some of their contents. So we copy the contents instead of increasing the mbuf refcount.
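For illustration, a minimal sketch of the kind of non-temporal copy being discussed, assuming x86 with SSE2. The helper name memcpy_nt is hypothetical, not a DPDK API; it uses streaming stores so the destination (e.g. a capture buffer that will not be read for minutes or hours) does not displace useful data from the CPU caches:

```c
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_stream_si128, _mm_sfence */
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a DPDK API: copy len bytes using
 * non-temporal (streaming) stores, bypassing the CPU caches
 * on the destination side. */
static void
memcpy_nt(void *dst, const void *src, size_t len)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	/* Align the destination to 16 bytes; _mm_stream_si128
	 * requires an aligned destination. */
	while (((uintptr_t)d & 15) != 0 && len > 0) {
		*d++ = *s++;
		len--;
	}

	/* Bulk of the copy: 16-byte streaming stores. */
	while (len >= 16) {
		__m128i v = _mm_loadu_si128((const __m128i *)s);
		_mm_stream_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		len -= 16;
	}

	/* Tail bytes through ordinary stores. */
	if (len > 0)
		memcpy(d, s, len);

	/* Streaming stores are weakly ordered; fence before, e.g.,
	 * publishing an updated capture-buffer write index. */
	_mm_sfence();
}
```

A production version would likely add wider (AVX) paths and prefetching of the source, as the various rte_memcpy implementations do for cached copies; this sketch only shows the streaming-store idea.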

We currently only use our packet capture feature for R&D purposes, so we have not optimized it yet. However, we will need to optimize it for production use at some point. So I find this discussion initiated by Manish very interesting.

-Morten
