RE: [RFC PATCH v1 0/4] Direct re-arming of buffers on receive side
Honnappa Nagarahalli
Honnappa.Nagarahalli at arm.com
Thu Jan 27 06:16:13 CET 2022
<snip>
>
> [quick summary: ethdev API to bypass mempool]
>
> 18/01/2022 16:51, Ferruh Yigit:
> > On 12/28/2021 6:55 AM, Feifei Wang wrote:
> > > Morten Brørup <mb at smartsharesystems.com>:
> > >> The patch provides a significant performance improvement, but I am
> > >> wondering if any real world applications exist that would use this.
> > >> Only a "router on a stick" (i.e. a single-port router) comes to my
> > >> mind, and that is probably sufficient to call it useful in the real
> > >> world. Do you have any other examples to support the usefulness of this
> patch?
> > >>
> > > One case I have is about network security. For a network firewall, all
> > > packets need to ingress on a specified port and egress on a specified
> > > port to do packet filtering.
> > > In this case, we can know the flow direction in advance.
> >
> > I also have some concerns about how useful this API will be in real life,
> > and whether the use case is worth the complexity it brings.
> > It also looks like too much low-level detail for the application.
I think the application writer already needs to know many low-level details to be able to extract performance out of PMDs. For example: fast free.
>
> That's difficult to judge.
> The use case is limited and the API has some severe limitations.
The use case applies to SmartNICs, which is a major use case. In terms of limitations, it depends on how one sees it. For example, the lcore cache is not applicable to pipeline mode, but it is still accepted because it is helpful for something else.
> The benefit is measured with l3fwd, which is not exactly a real app.
It is funny how we treat l3fwd. When it shows a performance improvement, we treat it as 'not a real application'. When it shows (even a small) performance drop, the patches are not accepted. We need to make up our mind 😊
> Do we want an API which improves performance in limited scenarios at the
> cost of breaking some general design assumptions?
It does not break any existing design assumptions. It is a very well suited optimization for the SmartNIC use case. For this use case, it does not make sense for the same thread to copy data to a temporary location (the lcore cache), read it back immediately, and store it in another location. That is a waste of CPU cycles and memory bandwidth.
>
> Can we achieve the same level of performance with a mempool trick?
We cannot, as this patch avoids the memory loads and stores (and the resulting backend stalls) caused by the temporary storage in the lcore cache.
>