[RFC v2] ethdev: an API for cache stashing hints
Stephen Hemminger
stephen at networkplumber.org
Wed Oct 23 22:18:40 CEST 2024
On Wed, 23 Oct 2024 19:59:35 +0200
Mattias Rönnblom <hofors at lysator.liu.se> wrote:
> > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > index 883e59a927..b90dc8793b 100644
> > --- a/lib/ethdev/ethdev_driver.h
> > +++ b/lib/ethdev/ethdev_driver.h
> > @@ -1235,6 +1235,70 @@ typedef int (*eth_count_aggr_ports_t)(struct rte_eth_dev *dev);
> > typedef int (*eth_map_aggr_tx_affinity_t)(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> > uint8_t affinity);
> >
> > +/**
> > + * @internal
> > + * Set cache stashing hint in the ethernet device.
> > + *
> > + * @param dev
> > + * Port (ethdev) handle.
> > + * @param cpuid
> > + * ID of the targeted CPU.
> > + * @param cache_level
> > + * Level of the cache to stash data.
>
> If we had a hwtopo API in DPDK, we could just use a node id in such a
> graph (of CPUs and caches) to describe where the data ideally would
> land. In such a case, you could have a node id for DDR as well, and
> thus you could drop the notion of "stashing" altogether. Just a "drop
> off the data here, please, if you can" API.
>
> I don't think this API and its documentation should talk about what the
> "CPU" needs, since it's somewhat misleading.
>
> For example, you may want the packet payload to land in the LLC even
> though no CPU will consume it, because you know with some certainty
> that the packet will soon be transmitted (and thus consumed by the
> NIC).
>
> The same scenario can occur when the consumer is an accelerator
> (e.g., a crypto engine).
>
> Likewise, you may know that the whole packet will be read by some CPU
> core, but you also know the system tends to buffer packets before they
> are processed. In such a case, it's better to go to DRAM right away,
> to avoid thrashing the LLC (or some other cache).
>
> Also, why do you need to use the word "host"? It seems like a PCI
> notion. This may be implemented over PCI, but it surely can be done
> (and has been done) without PCI.
+1 for the concept of having a CPU and PCI topology map that can be
queried by drivers and applications. Dumpster diving into sysfs is hard
to get right, and the amount of sysfs state to parse keeps growing. I
wonder whether an open source library already exists that would be a
good enough starting point for this.
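
For the query side, something along these lines is the kind of question
drivers and applications should be able to ask. This reuses the
hypothetical rte_hwtopo_* names from the sketch above;
rte_hwtopo_pci_node() and rte_hwtopo_distance() are equally made up.

#include <stdint.h>

#include <rte_lcore.h>
#include <rte_pci.h>

/*
 * Hypothetical: topology node for a PCI device, and the hop distance
 * between two nodes in the topology graph.
 */
rte_hwtopo_node_t rte_hwtopo_pci_node(const struct rte_pci_addr *addr);
uint32_t rte_hwtopo_distance(rte_hwtopo_node_t a, rte_hwtopo_node_t b);

/*
 * Pick the enabled lcore "closest" to a NIC. Today this means parsing
 * sysfs by hand; with a topology map it is a couple of lookups.
 */
static unsigned int
nearest_lcore(const struct rte_pci_addr *addr)
{
	rte_hwtopo_node_t dev = rte_hwtopo_pci_node(addr);
	unsigned int lcore, best = 0;
	uint32_t d, best_d = UINT32_MAX;

	RTE_LCORE_FOREACH(lcore) {
		d = rte_hwtopo_distance(dev, rte_hwtopo_lcore_node(lcore));
		if (d < best_d) {
			best_d = d;
			best = lcore;
		}
	}
	return best;
}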